A blog post by Kathryn Slover
What Is Web Archiving and How Do We Preserve a Website?
Web archiving is the process of collecting websites and the information they contain from the World Wide Web and preserving them in an archive. The process is similar to the traditional archiving of paper documents: the information is selected, stored, preserved, and made available to the public. Websites are archived through web crawls, which capture digital snapshots of a website at the time of the crawl.
The South Carolina Department of Archives and History (SCDAH) uses an automated web crawler operated by the Internet Archive to preserve state agency websites. The Internet Archive is a 501(c)(3) non-profit founded in 1996 with the purpose of offering permanent access to collections that exist in digital format. In the fall of 2005, the Internet Archive began pilot testing its new subscription service, Archive-It, which allows institutions to build, manage, and search their own web archives. SCDAH began using Archive-It in 2007 to archive state agency websites. The resulting South Carolina State Government Website Archives allows you to view South Carolina state agency websites from past dates, providing free and open access to the information long after the sites have changed on the live web.
What Do State Agency Website Archives Contain?
The South Carolina State Government Website Archives is composed of websites created by the state agencies of South Carolina’s government. These websites contain a number of digital assets, including but not limited to: minutes taken at the meetings of Boards and Commissions; photographs and video of significant events in the state’s history; statewide statistics; governmental policies; and other significant and related records. Seen as a whole, the South Carolina State Government Website Archives documents the state’s informational and educational priorities and shows the development of the government’s web presence over time.
How Do You Search the South Carolina State Government Website Archives?
There are two ways to search the South Carolina State Government Website Archives. The first is through the South Carolina Electronic Records Archive (SCERA). On the SCERA home page, move your mouse to the navigation bar on the right-hand side labeled “In This Section” and click on “Web Site Archives.” Under “Web Site Archives,” you can view an introduction to the South Carolina State Government Website Archives or click on one of the collections.
Archived websites are separated into four collections or categories: (1) Governor, Cabinet Agencies, and Misc. Sites, (2) State Agencies A-L, (3) State Agencies M-Z, and (4) South Carolina Redistricting Websites, 2001-2011. SCERA provides a list of the state agency websites that have been captured since 2007. For example, if you click on the link for the Office of the Governor Website from 2011 to the present, you will see all of the website captures from that time period. You can click on any of the dates to see what the Governor’s website looked like on that day and the information it provided.
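For readers who like to see the idea in code, choosing a capture by date can be sketched in a few lines of Python. This is only an illustration, not how SCERA or Archive-It is implemented. It assumes captures are identified by 14-digit timestamps (YYYYMMDDhhmmss), a form commonly used by Wayback-style archives. The function names and the example timestamps are hypothetical and do not correspond to real captures.

```python
from datetime import datetime

# Hypothetical capture timestamps in YYYYMMDDhhmmss form.
# These are made-up values, not real captures of any state agency site.
EXAMPLE_CAPTURES = ["20110115120000", "20110620083000", "20120301170000"]


def parse_capture(stamp: str) -> datetime:
    """Turn a 14-digit capture timestamp into a datetime."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S")


def closest_capture(captures: list[str], wanted: datetime) -> str:
    """Return the capture timestamp nearest to the wanted date."""
    if not captures:
        raise ValueError("no captures available")
    # Compare each capture's distance in time from the requested date.
    return min(captures, key=lambda s: abs(parse_capture(s) - wanted))
```

Calling `closest_capture(EXAMPLE_CAPTURES, datetime(2011, 6, 1))` would select the June 2011 entry from the made-up list, much as a visitor would pick the nearest date from the list of captures on the collection page.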
The second way to search is through the Archive-It website. Archive-It allows you to narrow your search by subject, date, and keyword. This is helpful if you want to search the websites by subject rather than focusing on a single agency’s site. For example, if you are looking for state agencies that deal with health in any capacity, you can simply type “health” in the search bar. Through Archive-It you can also search by collection or state agency. To search the Archive-It site for South Carolina state agency websites, type “South Carolina Department of Archives and History” into the “Explore Collecting Organizations” search bar. The SCDAH organization page will come up and list the four collections available to search.
Why Is It Important To Crawl State Agency Websites?
Technology and website appearances have changed and will continue to change rapidly. In some cases, websites can even lose data, whether intentionally or accidentally. For example, Myspace, a popular social networking site in the 2000s, announced in 2019 that it had lost 12 years of its users’ music uploads during a migration project. Websites like Facebook and Google have also either accidentally lost data or intentionally deleted it. The impermanence of the web has made it necessary to archive this information. In the case of state agency websites, archiving this material allows us to look back long after a website’s appearance and information have been changed or removed. Websites are regularly updated, and the information on them is constantly changing. It is essential to capture these records for preservation.
In addition to the information state agencies document, archiving this material is important because of its accessibility. Unlike the traditional paper records we bring into the Archives, this material is available to anyone with a computer and access to the internet. If someone is looking for the Governor’s proclamations from 2007 and cannot make a trip to the Archives to look in the Governor’s papers, this information is available through captures of the Office of the Governor website.
The internet is a vital means of communication. In addition to reports, minutes, and policies, the state agency websites document leadership changes, current events in the state, agency initiatives, and other important information that might not be documented elsewhere. It is becoming more common for information to be published solely on the web. Websites are an essential part of how state government functions and document a state agency’s communication with the public. Through the Archive-It service, the SCDAH preserves and promotes access to the websites and records of all South Carolina state agencies.