HAI sat down with Alaine Behler, Vice President of Marketing for North America at Preservica, one of our partners, for a conversation about the difference between digitization and digital preservation.
Q: Thanks for joining us, Alaine. Can you tell us about Preservica? When did the company get started with digital preservation and why?
A: Thank you for inviting me to this conversation. Preservica has been an industry leader in raising awareness of digital preservation – what it is and why it’s important – for nearly fifteen years.
Our company started in 2003 as a public sector research project with the UK National Archives. That project resulted in the creation of PRONOM, a searchable public technical registry, and DROID, a file format identification tool, both of which are still used today.
Through active collaboration in US and EU projects over the following decade, our workflow-driven active preservation solution evolved, leading to the launch of our ‘Preservation as a Software-Service’ (SaaS) offering in 2012 and, finally, Preservica Starter, our free-for-everyone version with up to 5GB of storage, in 2021.
Today, our software is used by more than 200 organizations, including 26 US state archives, 16 national archives and libraries, and dozens of government agencies around the world. Our client base also includes academic institutions, major corporations, museums and more.
Q: Some of our audience may be digitizing materials in an effort to preserve content that is fragile or inaccessible for stakeholders inside and outside their organization. What is the difference between digitization and digital preservation?
A: This is one of my favorite questions. There is a difference between digitization and digital preservation.
Digitization is a conversion process. It converts analog materials (such as photographs, handwritten letters, film, slides, and more) to a digital format. The output of that conversion (or digitization) process is called a digital preservation file (for example, a TIFF file). These files are quite large and include the image itself and related metadata captured in the digitization process. Because the preservation file is so large and valuable, most organizations create derivative files from the preservation files (for example, a JPEG or PDF file) for daily use and/or web publishing. They then store the preservation copy in a separate place in case they need it again.
Digital preservation, by comparison, is an active process of safeguarding all digital content, regardless of how it was created (born-digital or digitized). If content has been digitized, digital preservation ensures the preservation file and derivatives are never corrupted and ensures they can be accessed and used 30 or 40 years from now or indefinitely, even if file formats and technology change. Some of the work that digital preservation does includes:
- Providing quick access to information when needed
- Migrating file formats to replace outdated formats with modern versions so that content is always accessible
- Providing checksums and audit trails to ensure authenticity and trustworthiness of documents
- Conducting active integrity checks and self-healing to prevent corruption
- Storing multiple copies across multiple locations to ensure information isn’t lost
- Automating large-scale ingest of content for efficiency
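As a rough illustration of the checksum, integrity-check, and multiple-copy ideas in the list above, here is a minimal Python sketch. It is not how Preservica implements these features. The function names `sha256_of` and `check_and_heal` are hypothetical, and the sketch assumes the expected checksum was recorded when the file was first ingested.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_and_heal(copies: list[Path], expected: str) -> list[Path]:
    """Verify each stored copy against the recorded checksum.

    Any copy that is missing or no longer matches is overwritten from an
    intact copy. Returns the copies that had to be repaired and raises
    RuntimeError if no intact copy is left.
    """
    good = [p for p in copies if p.exists() and sha256_of(p) == expected]
    if not good:
        raise RuntimeError("no intact copy left to restore from")
    repaired = []
    for p in copies:
        if p not in good:
            shutil.copyfile(good[0], p)  # "self-heal" from a known-good copy
            repaired.append(p)
    return repaired
```

Run on a schedule against copies held in different locations, a check like this catches silent corruption while at least one good copy still exists. A real system would also write each check and repair to an audit trail.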
The simplest way to explain the difference is that digitization is one small part of digital preservation. If you digitize your analog items, it is a good first step, but having a repository of outdated or corrupted files is not going to help you find information in 30-50 years. I like to use the example of a WordPerfect file. Who has WordPerfect on their computer today? How would you open that document? If you took the step of digital preservation with that WordPerfect file, you would still have the original file in your Preservica database, but you would be able to open and view the file in today’s format. This is likely Microsoft Word since that is our current standard for documents. Who knows what it will be tomorrow, but Preservica will continue to support technology upgrades to ensure the file is updated as necessary. It’s an active process for the lifetime of the document or file.
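To make the WordPerfect example concrete, the first step in any format migration is working out what format a file actually is. The Python sketch below does this by matching a file's leading bytes against a handful of well-known signatures. It is a toy under stated assumptions, not DROID or PRONOM: the `SIGNATURES` table is a tiny hand-picked subset, the `\xffWPC` signature is assumed to cover WordPerfect 5.x and later files, and the `AT_RISK` policy and both function names are hypothetical.

```python
# Leading-byte signatures for a few well-known formats (illustrative subset only).
SIGNATURES = {
    b"\xffWPC": "WordPerfect document",
    b"%PDF": "PDF",
    b"II*\x00": "TIFF (little-endian)",
    b"MM\x00*": "TIFF (big-endian)",
    b"\xff\xd8\xff": "JPEG",
}

# Hypothetical policy: formats we would queue for migration to a current format.
AT_RISK = {"WordPerfect document"}


def identify(header: bytes) -> str:
    """Return the format name whose signature matches the start of the file."""
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"


def needs_migration(header: bytes) -> bool:
    """True if the identified format is on the at-risk list."""
    return identify(header) in AT_RISK
```

In practice you would pass in the first few bytes read from each file. Files flagged by `needs_migration` would get a new access copy in a current format, while the original file is kept alongside it.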
Q: Why is it important to understand the difference between digitization and digital preservation?
A: It’s important to understand that digitization does not offer security in perpetuity. I think a lot of organizations get a false sense of security from digitization projects, not realizing that they are only taking the first step towards preservation. Organizations invest time and resources to complete a digitization project, and they should protect that investment from the outset. All digital objects, regardless of how they are generated (born-digital or digitized), can be lost and/or the lifespan of the object greatly reduced if the overall lifecycle of digital records is not considered. The cost of lost data is two-fold:
- (1) wasted resources (time and money) spent to generate the assets to begin with, and
- (2) perhaps more importantly, the loss of critical information.
It is so important to preserve your files.
Q: Is storing files in SharePoint, on a hard drive, or even in the cloud, the same as digital preservation?
A: That’s a great question. No, they are not the same. Digital preservation is future-proofing digital objects to ensure content created today is still readable 10 years from now or indefinitely. It is an active process of converting formats, fixity checking, and more, to ensure that information is reliable and always accessible.
Simply storing files in a tool such as SharePoint or on a hard drive does not ensure they will be available down the road.
Q: What can organizations do to protect their investment in digitizing materials?
A: Ask yourself the question: how will I access this in 50 years?
To ensure digital objects are available when they are needed now and in the future, custodians should start with digital preservation as the foundational element to their program. All other elements should feed into the digital preservation plan.
Q: How can organizations start to explore digital preservation?
A: There are a lot of great resources to learn more about digital preservation:
- Additional reading on the topic of digitization vs. digital preservation can be found here:
- Read up on OAIS, the international standard for digital preservation.
- Review current recommendations on how to evaluate digital preservation systems to ensure you are making the right investment – Charter for Long-Term Digital Preservation Sustainability
- Explore digital preservation on your own. Preservica offers a free digital preservation tool called Starter that gives you up to 5GB of storage to try digital preservation for yourself. Many organizations use Starter for their digital preservation needs, and you can sign up for Preservica Starter to try it.