Finding your way… among newspapers
Digitisation can be a lengthy process that extends well beyond the steps, funding, resources and agreements required to start it. Selecting the sources can be a complicated task in itself, especially when there are millions of newspapers to choose from and the aim is to let emerging research outcomes inform the choice of titles to digitise. In practice, the needs of a flexible, radically collaborative research project are not always fully compatible with those of digitisation planning, which requires a careful and precise calculation of workload, scope and timescales in advance.
Questions to balance include: How can the amount of material to digitise be calculated carefully? How can we ensure balanced geographical coverage? How can we prioritise newspaper runs with more microfilm (when available) over those held mainly as hard copies? How can we de-prioritise titles with old microfilms or copyright issues? And how can we exclude titles whose period of publication falls outside the interest of the research project while still ensuring that, wherever possible, the full run is digitised?
These and other questions can complicate the selection process. When the variables and parameters of selection multiply and need to be applied at scale across hundreds of potential choices, a manual, knowledge-driven choice can become difficult and unreliable. To overcome this problem, we approached it collaboratively. First, we gathered and documented the requirements and parameters for choosing newspapers from newspaper curators, historians and image technicians. Second, we requested all the available metadata about the newspapers from the metadata team at the British Library. With these in hand, our research software developers created a Jupyter Notebook that orders and visualises the newspapers according to our priorities in order to support the selection.
That’s how the Press Picker tool was born. A future blog post will explain more about how we created and use it.
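In the meantime, to give a flavour of what ordering titles by several competing priorities can look like, here is a minimal Python sketch. It is not the Press Picker code: the field names, the penalty weights, the period of interest and the sample titles are all hypothetical, chosen only to illustrate the kind of questions listed above.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical period of interest (assumption, not the project's real window).
PERIOD = (1800, 1900)


@dataclass
class Title:
    """One newspaper title with illustrative, made-up metadata fields."""
    name: str
    region: str
    microfilm_reels: int
    hard_copy_volumes: int
    old_microfilm: bool
    copyright_issue: bool
    first_year: int
    last_year: int


def overlaps_period(title, period=PERIOD):
    """True if the title's run overlaps the period of interest at all."""
    start, end = period
    return title.first_year <= end and title.last_year >= start


def priority(title):
    """Higher is better: share of the run on microfilm, minus penalties.

    The 0.5 penalties are arbitrary illustrative weights.
    """
    total = title.microfilm_reels + title.hard_copy_volumes
    microfilm_share = title.microfilm_reels / total if total else 0.0
    penalty = 0.5 * title.old_microfilm + 0.5 * title.copyright_issue
    return microfilm_share - penalty


def rank(titles, period=PERIOD):
    """Drop titles outside the period, then sort by descending priority."""
    candidates = [t for t in titles if overlaps_period(t, period)]
    return sorted(candidates, key=priority, reverse=True)


# Invented sample data, for illustration only.
TITLES = [
    Title("Title A", "North", 40, 10, False, False, 1820, 1880),
    Title("Title B", "South", 10, 40, False, False, 1850, 1910),
    Title("Title C", "North", 30, 0, True, False, 1790, 1860),
    Title("Title D", "East", 20, 5, False, False, 1905, 1950),
    Title("Title E", "East", 25, 25, False, True, 1840, 1870),
]

ranked = rank(TITLES)
coverage = Counter(t.region for t in ranked)  # rough check on regional balance

for t in ranked:
    print(f"{t.name:8} {t.region:6} priority={priority(t):.2f}")
print(dict(coverage))
```

A single score like this is only a starting point. The regional counts are printed alongside the ranking because geographical balance is hard to fold into one number, so a person still needs to look at the ordered list and make the final call.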