
Web Site Archives

Author: Anna Henrichsen, State Archives of North Carolina

North Carolina maintains an archive of government websites and social media pages

This program is run in partnership by the State Archives of North Carolina and the State Library of North Carolina. Together we have a team that is dedicated to making sure that websites and social media pages created by government entities are preserved and accessible for years to come. 

State Archives and State Library

The State Library and the State Archives are sister agencies under the umbrella of the North Carolina Department of Natural and Cultural Resources (DNCR). Like its counterparts in many other states, DNCR has a legal mandate to preserve state government information in any format and make it accessible to the citizens of North Carolina. One of the major differences between the Archives and the Library is the type of materials we focus on maintaining.

The State Archives focuses on preserving records. Per General Statute 121, the purpose of the Archives is to collect, preserve, and provide access to public records as well as historically significant archival materials relating to North Carolina. We collect government websites and social media accounts because they count as public records: General Statute 132 defines public records as all documents, regardless of physical form or characteristics, made or received in the transaction of public business in North Carolina, which means born-digital records, such as websites, are included.

The Government Heritage Library (GHL), a section of the State Library, acts as the official permanent depository for all North Carolina state publications. General Statute 125 mandates that GHL collect, preserve, and provide access to state publications, which are defined as any document prepared, printed, and published, regardless of format. As a result, the library’s collection has moved over time from solely print to a hybrid collection of titles in print and digital, websites included!

Because state agency websites and social media accounts house and create both publications and original records, the State Archives and the State Library work together to capture this data. This work is done through the North Carolina State Government Web Site Archive and Access Program.

How web archiving works

The two main services we rely upon to capture website and social media data are Archive-It and ArchiveSocial. We use two services because each functions a little differently and has different strengths.

Archive-It 

You may recognize the name because it is a service built by the Internet Archive, the folks responsible for the Wayback Machine. Archive-It is basically a Wayback Machine specifically dedicated to capturing the websites of cultural heritage institutions. We tell Archive-It which websites we want to capture and how often, and it does all the work of crawling the sites and providing interactive captures of how the sites looked at different points in time.

We use Archive-It primarily to capture websites, although it can capture some information about social media too (basically anything that is public facing). Some examples of what we capture include state agency websites, board and commission meeting minutes, publications only available on the web, governmental policies, historically significant photographs and videos, state-wide statistics, social media sites such as Facebook and Twitter, and agency blogs. As of January 2020, we have 24.3 TB of data archived, dating back to 2005 when we started using Archive-It.

ArchiveSocial

ArchiveSocial is a bit more specialized. It continuously captures posts made from state agency accounts on Facebook, Twitter, Flickr, Instagram, YouTube, and LinkedIn. Its main draw is that it captures not only public posts, including things like comments on those posts, but also private and direct messages sent or received by these accounts. We started using ArchiveSocial in 2012, and as of January 2020 we have 173 social media accounts in our archive (101 active accounts and 72 historical accounts).

This archive supplements, but does not replace, the social media account posts captured by Archive-It, so there is overlap between the two. For the best results, search both archives if you are looking for the records of a social media account. We have a limited number of accounts we can put into ArchiveSocial, so we try to identify accounts that are very active or often receive direct messages to make sure that information is correctly captured.

How do I search the web archives?

The content collected by Archive-It and ArchiveSocial is available 24/7 via our website.

There are two search bars on our home page – one to search Archive-It and one to search ArchiveSocial. Once you start a search in either platform, you will be taken to a page where you can view your results and, if necessary, narrow your search parameters or start a new search.

A basic Archive-It search will look for your keyword across the metadata collected for each website in the archive and organize your results by individual site. If a site interests you, you can click it and explore how that website looked on the different dates it was captured.

ArchiveSocial looks a little different. A basic search will look for your keyword across all the different social media posts in ArchiveSocial and organize them by relevance. You can click individual posts to see slightly more information about a particular record, such as the account that posted it, the type of content it is, and when it was posted.

Tutorials

Beyond basic searches, each platform allows for advanced searches and the ability to narrow results by filtering on different metadata fields. If you are interested in learning more tips and tricks about searching the web archives, the State Library created several YouTube tutorials that you may find helpful:

Introduction to the Web Archives
Searching Archive-It
Searching ArchiveSocial

Questions

You can always reach out to the State Library or the State Archives with questions, whether it’s about web archiving generally or something specific about the Web Site Archives and Access Program. We are passionate about preserving and providing access to the history of North Carolina – websites included! – and are happy to assist you in any way we can.

A few members of the web archiving team:
Andrea Green, Digital Collections Manager, State Library 
Anna Henrichsen, Information Management Archivist, State Archives
Krista Sorenson, Digital Projects Librarian, State Library
Camille Tyndall Watson, Digital Services Section Head, State Archives
