I am caught in a debate about digitizing versus digital preservation. I maintain that you can digitize without practicing digital preservation. Digitizing is the creation of digital information. Digital preservation is what maintains that digitized information to meet short- or long-term requirements, and it includes activities such as refreshing, migration, and replication. Digital preservation is more than having your documents stored in a system. Do you have a digital preservation program? Is digital preservation a function of Library, Archives, or IT? What are the requirements, and how is it managed?

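One typical example of this distinction is the use of PDF vs. PDF/A for documents. We use PDF for working documents, but convert to PDF/A for documents flagged for preservation. PDF/A is the ISO-standardized archival profile of PDF: the file has to be self-contained (fonts embedded, no encryption, no reliance on external content), so it should render the same way years from now. Strictly speaking it doesn’t lock the file against editing, but it marks the document as a final version meant to be kept rather than changed.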

On a more practical level, we use a cloud-based records management system. Our retention schedule defines the retention periods for the different types of records that we store in the system. Since the records stored there tend to be working files, everything uploaded there that falls within our retention schedule gets copied over to S3 storage in AWS. That way, if someone deletes a file from our records management system, the file is preserved in S3. That storage is replicated between two geographically separated Amazon data centers (this part isn’t a function of digital preservation as much as it is of disaster recovery, should the primary data center go down). We also archive emails and other critical business files from our network there.
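To illustrate the "copy it somewhere safe so a deletion doesn't lose it" step, here is a minimal sketch in Python. It is not our actual integration: a local folder stands in for the S3 bucket, and the function names (`sha256_of`, `archive_copy`, `still_intact`) are made up for the example. The point it shows is that the copy is verified with a checksum, and that the same checksum can be re-checked later to confirm the archived file hasn't changed.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_copy(source: Path, archive_dir: Path) -> str:
    """Copy `source` into `archive_dir` and verify the copy by checksum.

    `archive_dir` is a stand-in for the real archive (assumption: a local
    folder instead of object storage). Returns the checksum so it can be
    recorded and re-checked later.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / source.name
    expected = sha256_of(source)
    shutil.copy2(source, target)  # copy2 also keeps file timestamps
    if sha256_of(target) != expected:
        raise IOError(f"checksum mismatch after copying {source}")
    return expected


def still_intact(archived: Path, recorded_checksum: str) -> bool:
    """Fixity check: the archived file exists and still matches its checksum."""
    return archived.exists() and sha256_of(archived) == recorded_checksum
```

If the working copy is later deleted from the records system, `still_intact` on the archived copy should still return `True`; if the archived file is altered or goes missing, it returns `False`.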

Most of this is managed through our regular business processes, with a lot of automation backing it up. Retention periods typically start counting from a change in an item's status (an employee is terminated, a job is completed, etc.). Each year we generate department-level reports of documents whose retention period has ended. The reports go through an approval process, and once they are approved, an automated process deletes the documents from our records system and then from the archive.
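The trigger-based retention logic can be sketched roughly as follows. This is an illustration only, not our production code: the record types, the retention periods in `RETENTION_YEARS`, and the names `Record`, `disposal_date`, and `annual_report` are all hypothetical. It shows a retention clock that starts at the trigger event and a per-department list of records that are due as of the report date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical schedule: record type -> years to keep after the trigger event.
RETENTION_YEARS = {"personnel_file": 7, "job_file": 10}


@dataclass
class Record:
    name: str
    record_type: str
    department: str
    trigger_date: Optional[date]  # None until the trigger event has happened


def disposal_date(record: Record) -> Optional[date]:
    """Return the date the record becomes eligible for disposal, if known."""
    if record.trigger_date is None:
        return None  # e.g. employee still active, job still open
    start = record.trigger_date
    target_year = start.year + RETENTION_YEARS[record.record_type]
    try:
        return start.replace(year=target_year)
    except ValueError:
        # Trigger fell on Feb 29 and the target year is not a leap year.
        return start.replace(year=target_year, day=28)


def annual_report(records, as_of: date) -> dict:
    """Group the names of records due for disposal by department."""
    report = {}
    for record in records:
        due = disposal_date(record)
        if due is not None and due <= as_of:
            report.setdefault(record.department, []).append(record.name)
    return report
```

Nothing is deleted here on purpose: the output is only the list that would go to each department for approval, and deletion would be a separate step that runs after sign-off.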

We manage all of this through IT, with input from key stakeholders within each department.


Sounds good to me. Check the NARA website for additional information and references. There is also a boatload of vendor information on the topic.

This is another phrase that means different things to different folks, like “backup” and “archive.” The general worry is the ability to read electronically stored information (ESI) that must be retained for more than 20 years. Intellectual property data used to file for and defend a patent may come from R&D in various formats that are challenging to maintain. Electronic laboratory notebooks (ELNs) are great for accessing and sharing data quickly, but they are made of many components and formats linked in an ELN “dashboard.” There are companies that specialize in this work (Preservica comes to mind), but the preservation must be worth the cost. Not everything can be converted to PDF/A.

In my organization, digital preservation is part of the records management program rather than a program unto itself. I am drafting requirements for my organization on digital archiving formats, and the records program already has a policy on digitizing paper records (although it needs updating). My agency also follows state requirements for the storage of electronic records.

I agree with your stance that scanning a document isn’t the same as digital preservation, just as putting documents in a filing cabinet doesn’t mean someone can claim they archived the records. With digital preservation you have to consider file formats, compression rates, the longevity of hardware and software, scanning requirements, storage requirements, metadata, the record’s original format (digital or analog), the record’s retention requirements, and more. It also doesn’t stop at documents; it can include audio, video, film, photographs, and CAD drawings, and each of these object types has its own specific needs. Unlike traditional archiving, there are a number of differing views on file formats and on how to manage digital records so that they remain accessible. A good digital preservation process requires a lot of research, consideration of the organization’s needs, documentation, and training. So keep up the debate; it’s worth the time.

I’ve found a number of useful sites while doing my own research. Let me know if you’re interested.

• Library of Congress
• Sustainability of Digital Formats: Planning for Library of Congress Collections
• Preservation Self-Assessment Program
• National Film Preservation Foundation
• Federal Agencies Digital Guidelines Initiative
• National Digital Stewardship Alliance
• British Library Digital Preservation
• University of Massachusetts Amherst Libraries
• Digital Preservation Coalition
• Open Preservation Foundation
• Digital Curation Centre
• Smithsonian Institution Archives
• National Archives
• Association for Recorded Sound Collections
• Northeast Document Conservation Center


I have learned over the years that one way to prepare documents for future digital transfer and preservation is to capture them in a standard format. The most commonly suggested format is PDF. The documents should be kept on a managed system running a supported operating system.
That way, converting documents from one file format to another, or moving them from one operating system to another, will be less burdensome.
External storage media such as CDs and USB flash drives are not recommended.
Other media assets, such as videos, photographs, and drawings, are a different case and have their own requirements.

