Posted by: bluesyemre | July 17, 2017

How much #newspaper content is digitized? @SimonTanner


There is a lot of digitized newspaper content out there in the world, but it still seems to be a small fraction of the total newspaper collections worldwide (possibly less than 5% of all English-language content, by my guesswork). This blog highlights some sources and seeks more information on how much newspaper content is out there and yet to be digitized.

A quick look at the Library of Congress Historic American Newspapers site shows 154,205 titles available and 12 million searchable newspaper pages digitized. The British Newspaper Archive currently shows 20 million pages.

And yet… I feel these digitized collections are still only a fragment of the newspaper resources that are out there to be digitized. They reflect the challenge of building a digitized corpus when there is so much printed material and so few resources for digitization. In the British Library alone there are approximately 450 million pages of printed material, of which roughly 18 million have been digitized (tweet from Luke McKernan). If there is that much left to do at the British Library, then how much more is there to do elsewhere? The simple answer is that we don’t really know, and the picture is confounded by a number of issues.
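As a rough check on the scale of that gap, the British Library figures quoted above can be turned into a percentage. This is only a back-of-the-envelope sketch in Python: the function name `digitized_share` is made up for illustration, and the inputs are the approximate figures from the tweet, not official statistics.

```python
def digitized_share(digitized_pages: int, total_pages: int) -> float:
    """Return the digitized share of a collection as a percentage."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return 100.0 * digitized_pages / total_pages


# Approximate British Library figures quoted in the post (assumed round numbers).
bl_share = digitized_share(18_000_000, 450_000_000)
print(f"{bl_share:.1f}% digitized")  # prints "4.0% digitized"
```

That sits close to the under-5% guess at the top of the post, although that guess was about English-language content as a whole rather than a single library.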

These confounding issues are best highlighted in the European Newspaper Survey Report from the European Library / Europeana Foundation authored by Alistair Dunning in 2012. The report states:

Over half of the libraries (27 out of 47, 57%) have a cut off date beyond which they will not publish digitised newspapers on the web. Most frequently, this is based on a 70 year sliding scale, meaning that content after 1942 is inaccessible in digital form. 23% (11 out of 47) had an agreement with a rights organisation so that in-copyright digitised newspapers could be published. However, this tended to be restricted to individual titles rather than collective agreements for complete collections.
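The numbers in that passage can be reproduced with a little arithmetic. The sketch below assumes the sliding scale is a plain subtraction from the current year (individual libraries may apply it differently) and uses hypothetical helper names `cutoff_year` and `percent`; neither comes from the report.

```python
def cutoff_year(current_year: int, window_years: int = 70) -> int:
    """Latest publication year that falls outside a sliding rights window.

    Assumption: the cutoff is simply current_year minus the window length.
    """
    return current_year - window_years


def percent(count: int, total: int) -> int:
    """Share of `count` in `total`, rounded to the nearest whole percent."""
    return round(100 * count / total)


print(cutoff_year(2012))  # 1942, the report's publication year minus 70
print(cutoff_year(2017))  # 1947, the same rule applied in the year of this post
print(percent(27, 47))    # 57, libraries with a cut-off date
print(percent(11, 47))    # 23, libraries with a rights agreement
```

Because the scale slides, the accessible window should creep forward by one year annually, which helps only slowly with twentieth-century content.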
