Posted by: bluesyemre | January 29, 2018

Modelling our digital archival data by The National Archives


The National Archives’ Digital Strategy (2017-19) identifies the challenges we face as we become a second-generation digital archive. The strategy envisions an archive that is ‘Digital by Design’, able to preserve and provide access to a wide range of rich digital records which better reflect the workings of a digital government. In particular, the strategy highlights the need for a new way of providing access to digital archival records.

The digital records we’ve received so far are more diverse than physical records. They are not just documents but can include all sorts of other content, from threaded discussions using a web-based tool to video, websites, structured datasets or even computer code. These records can be complex and often consist of different components, potentially with different creators and owners. And we know that ever more specialised formats will need to be added to our collection over the coming years.

However, these new digital objects are public records in the same way as the paper files they have replaced, and we need to make them at least as accessible as analogue records are now. In doing this, we cannot simply take the standards, processes and tools we use in the paper world and apply them in the digital world – the two are significantly different.

We have started exploring a new way of providing access, designed from the outset to fit how our users want to work with digital records today and to support how they may want to work with them in the future. A key difference is that, while we will still offer a ‘view’ of individual records for our ‘readers’, we must also make records available for computational analysis, so that our ‘data users’ can work with records at scale and ask very different types of research questions.

At the same time, we will actively process the collection ourselves, to enrich descriptions and contextualise the records. This activity will produce information of a different size and shape to our traditional catalogue descriptions. For example, we can imagine contextualising records through their links to other resources, often held by other institutions; enriching descriptions to make the records more discoverable; or applying probabilistic techniques that embrace the uncertainty typical of historical records.
