Ad or not? New crowdsourcing task

Written by Mia Ridge, August 12, 2021

It’s not always easy for humans to recognise a snippet of text as being an advertisement, so it’s no wonder that software can struggle. ‘Ad or not’ is a new task that asks you to help label snippets from 19th-century newspapers.

As posted to the Zooniverse Talk pages, we want to know what a ‘machine’ was in the 19th century as part of our linguistic and historical research. We have an incredible dataset to work with, thanks to previous crowdsourcing tasks, but our analysis has hit a snag: our dataset is full of advertisements.

Because ads were often repeated in successive issues of a paper, machines that were heavily advertised are over-represented in our data. The technology that aims to identify ads does not work very well on nineteenth-century newspapers, so we need your help to identify them.
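To make the over-representation problem concrete, here is a minimal sketch in Python. The snippets, the machine terms and the `count_terms` helper are all invented for illustration and are not taken from our dataset or our analysis code. It simply counts how often each term appears, with and without the snippets labelled as ads.

```python
from collections import Counter

# Hypothetical labelled snippets: (machine term, is_ad).
# These rows are made up for illustration; they are not project data.
snippets = [
    ("sewing machine", True),   # the same ad repeated in successive issues
    ("sewing machine", True),
    ("sewing machine", True),
    ("sewing machine", False),
    ("threshing machine", False),
    ("threshing machine", False),
    ("steam engine", False),
]


def count_terms(rows, include_ads=True):
    """Count how often each machine term appears, optionally skipping ads."""
    return Counter(term for term, is_ad in rows if include_ads or not is_ad)


all_counts = count_terms(snippets)
non_ad_counts = count_terms(snippets, include_ads=False)

print("With ads:   ", all_counts.most_common())
print("Without ads:", non_ad_counts.most_common())
```

In this toy example the repeated ad makes ‘sewing machine’ look like the most discussed machine, while the non-ad snippets alone put ‘threshing machine’ first. An ‘ad or not’ label on each snippet is what makes that second count possible.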

This screenshot from preliminary analysis shows why newspaper researchers need to know ‘ad or not’!

Preliminary grouping of machine descriptions

‘Ad or not’ asks you to look at an image and decide if the highlighted ‘machine’ is in an ad. We define ads as any paid notice, including short classifieds, listings and ‘advertorials’ designed to look like articles. If you’re not sure – guess!

A screenshot from our new ‘ad or not’ task

Have you tried the task? What questions do you have? Your feedback can be incredibly helpful, so we look forward to hearing your thoughts.

The results will immediately improve our analysis of the data from our earlier crowdsourcing / digital volunteer project. We hope they’ll also help train machine learning models to identify adverts, and eventually benefit all historical newspaper scholars.

Our Funder and Partners