The SMART Guide to Back-ups: Archiving

31 August 2010, by Ariadne Cedilla

What’s the purpose of archiving your electronic data?

Preservation. Simply put, archiving, in this sense, is preservation. Of memories (vacation photos, wedding videos, the Family Reunion Of Which You Do Not Speak, etc.) and data (like photographs, business documents, scripts, novels, music, etc.) that are static and won’t change.

Revisions and retouching aside, archiving stabilizes and stores a set of information that you can use for reference, proof and perhaps most vital of all, connection.

  • Need the very first “final version” of the company handbook before the current set of revisions got too bulky? Here you go.
  • A lawsuit is threatening to come down on your company due to what the opposition claims is shoddy work? You pull out the pics to prove it just ain’t so.
  • A dispute arises over the contract? You got it in writing, and with the digital signatures all in place.
  • You have a lifetime’s worth of photographs. You want them to last beyond your lifetime, so your great-grands can get to know you even after you’re gone.
  • Want to show your kids that yes, Mama was a natural blonde before puberty hit and her hair got darker? Sure.
  • Your in-laws want to see the ultrasound? You want your as-yet-unbaked bun-in-the-oven to get to see it? You got it.
  • You have something you need to say, something you want the world to know. Share it across time.

Insurance (as back-up) is secondary (but a close second) in this case. If you’ve done your back-ups according to plan, you’re covering those bases. Archiving is simply making sure that any data you have that’s important to you is stored safely, uncorrupted, and available in a form or medium that will survive time and technological changes.

Think of the rate of change you’ve experienced just in your lifetime. There are people who have never experienced watching videotapes back when they were tapes. (Betamax, anyone?) There are people you know who have never experienced life without the Internet, and the Internet itself as we know it is barely a quarter-century old. And see what changes that phenomenon spurred in communication alone. Tech changes come in waves, but the need to keep important things safe is an age-old imperative. You only have to visit museums to see that need expressed in material form.

So, back to being SMART (remember SMART?)

Preserve what e-data? Aside from the obvious static part (original versions, for example), basically, anything and everything you value and want to keep with you (or at the very least pass on for posterity and historical-cultural value, if nothing else.)

You’ve probably experienced one or more of the following scenarios:

  • The frustration of trying to decipher cryptic file names and labeling systems that made sense at the time. Back then, you were in a real hurry, it was convenient… and now you’re paying for your hastiness.
  • Giving up after you’ve forgotten the password to a VIP file. Of master passwords.
  • Having to surrender all hope of retrieving a file because the program used to create and open it is obsolete. Or the medium used as storage is obsolete, or was improperly stored and was damaged without you knowing.
  • Losing childhood memories because the tapes deteriorated while in storage, to the point that playing them would destroy them.

There is no one perfect set-up. Unless you have the wherewithal to have a climate-controlled vault for all your treasures, you have to make do with what you’ve got and with the technology available, adjusting when the tech changes enough to warrant another transfer to a new medium. For now, anyway, the most affordable options for the man on the street are hard drives and optical media to capture and store data.

How do you set it up?
The ideal archive set-up would have order built-in. Logical, concise, neat. Easily accessed and accessible, to a point. Some archivists argue that the best archives, once verified and checked corruption-free, should be left entirely alone to preserve the pristine security of the data. “That’s why you archive them. Seal them away, don’t touch. Use your other master-copy.”
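One way to do the "verified and checked corruption-free" part is to record a checksum for every file when you seal the archive, then compare later. Here is a minimal sketch, assuming SHA-256 as the checksum and a plain folder of files as the archive; the function names (`sha256_of`, `build_manifest`, `find_problems`) are made up for this illustration and are not from any particular tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(archive_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to the archive folder) to its checksum."""
    return {
        p.relative_to(archive_dir).as_posix(): sha256_of(p)
        for p in sorted(archive_dir.rglob("*"))
        if p.is_file()
    }


def find_problems(archive_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or whose contents no longer match."""
    problems = []
    for name, expected in manifest.items():
        target = archive_dir / name
        if not target.is_file():
            problems.append(f"missing: {name}")
        elif sha256_of(target) != expected:
            problems.append(f"changed: {name}")
    return problems
```

Build the manifest once, when the archive is sealed, and keep a copy of it with each set of discs or drives. An empty list from `find_problems` on a later check suggests the files still match what you stored.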

There’s also the question of versioning — keeping copies that changed over time — as well as thinking of optimal physical separation and placement. An entire neighborhood can be affected by fire and flood. How far do you want to go to keep your archive safe? In another house, your neighbor’s, perhaps? In a bank vault? Would the master copies be separated by state? By continent? Don’t laugh, there are people making good money thinking of things like this.

Where would you store it?
Remember the 3-2-1 rule? 3 copies, 2 media, 1 copy off-site. Aside from extra hard drives, the simplest, easiest, cheapest storage media for archiving electronic data are CDs and DVDs.
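If you keep a simple list of where your copies live, the 3-2-1 rule is easy to check. The sketch below is only an illustration: the `Copy` record and the `meets_3_2_1` function are hypothetical names, and "medium" is whatever label you choose to give each kind of storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    label: str     # your own name for this copy (hypothetical field)
    medium: str    # e.g. "hard drive" or "DVD"
    offsite: bool  # True if stored away from where the originals live


def meets_3_2_1(copies: list[Copy]) -> bool:
    """Check for 3 copies, 2 different media, and at least 1 copy off-site."""
    media = {c.medium for c in copies}
    return (
        len(copies) >= 3
        and len(media) >= 2
        and any(c.offsite for c in copies)
    )


my_copies = [
    Copy("desktop drive", "hard drive", offsite=False),
    Copy("external drive", "hard drive", offsite=False),
    Copy("disc set at the bank", "DVD", offsite=True),
]
print(meets_3_2_1(my_copies))
```

Drop the off-site disc set from the list and the check fails, which is the point of the rule: two drives in the same house are still one fire away from nothing.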

Simple copy means burning to CD/DVD. Get one of the better quality brands. With spindles coming to $20.00-30.00 for, say, 100 pieces, you get data insurance for pennies. Pennies. People whose job it is to know about things like this will tell you to invest in quality recordable optical media like Taiyo Yuden (touted as simply the best quality optical media for people who really want to make sure their data will remain safe and incorruptible for a very long time, with the proper care of course).
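To put a number on "pennies", here is a quick back-of-the-envelope sketch using the price range above. The 4.7 GB figure is an assumption (the nominal capacity of a single-layer DVD), and the function names are made up for this example.

```python
def cost_per_disc(pack_price: float, discs_in_pack: int) -> float:
    """Price of one disc, given the price of the whole pack."""
    return pack_price / discs_in_pack


def cost_per_gb(pack_price: float, discs_in_pack: int,
                gb_per_disc: float = 4.7) -> float:
    """Price per gigabyte, assuming every disc is filled to capacity."""
    return cost_per_disc(pack_price, discs_in_pack) / gb_per_disc


for price in (20.00, 30.00):
    per_disc = cost_per_disc(price, 100)
    per_gb = cost_per_gb(price, 100)
    print(f"${price:.2f} pack: ${per_disc:.2f} per disc, about ${per_gb:.3f} per GB")
```

So a pack in that price range works out to 20 to 30 cents a disc, and less than a dime per gigabyte under the capacity assumption above.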

Who will you trust with your archive?
Where, and with whom, will you put your precious data? You have a lot of options: in the bank, with a friend, out-of-state. If you’re really serious about it, in a climate-controlled vault with positive air-flow, to keep the contaminants out. Or, you can store your optical data like the people at the Smithsonian do:

Their data is stored redundantly on DVD-R/DVD+R format optical media with a minimum of 2 offline copies: a preservation ‘master’ and a preservation ‘backup’. DVD-R/DVD+R are recognized “Write-Once-Read-Many” formats, which ensure the integrity of the files placed on that disk.

An additional set may be created for reference/public access which may exist online, if permissions/rights issues and internal policies allow. The preservation master DVD disk is to be stored in appropriate offsite archival storage. Technical information regarding creation date of the disks and software used for disk creation is to accompany the disks.

Digitization projects with preservation master storage requirements over 250 gigabytes should consult with the SIA IT Archivist for preferred storage medium details.

Important!
No labels are to be affixed directly to the DVD disk. No markers are to be used to label the DVD disk. Both of these activities have been proven in national studies to degrade the archival life of the DVD.

DVD disks are to be stored in ANSI certified inert sleeves separately from paper and in an upright position. Labels are to be affixed to the outside of the sleeve to avoid direct contact with the disk.
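The guidance above asks that the creation date and the software used travel with the disks, and that nothing gets written on the disks themselves. One way to do that at home is a small text record printed and tucked in with the sleeve, or saved alongside your other copies. This is a minimal sketch only: the `DiscRecord` fields and the `to_sidecar_json` function are hypothetical, not a Smithsonian format, and the sample values are placeholders.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DiscRecord:
    disc_id: str           # your own identifier, written on the sleeve label
    role: str              # "master" or "backup"
    created: str           # creation date in ISO form, e.g. "2010-08-31"
    burning_software: str  # name and version of the program used
    storage_location: str  # where this disc is kept


def to_sidecar_json(record: DiscRecord) -> str:
    """Serialize the record as readable JSON to keep with the disc's sleeve."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


record = DiscRecord(
    disc_id="family-photos-001",
    role="master",
    created="2010-08-31",
    burning_software="(name and version of your burning program)",
    storage_location="off-site",
)
print(to_sidecar_json(record))
```

Plain text like this has a good chance of staying readable long after the burning program itself is gone.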

For how long can you store your archive?
For CDs and DVDs, no one really knows. The best proven case is 20-25 years, with evidence, both anecdotal and researched, that some of the better quality early-’80s CDs are still going strong. The best hoped for is 50 years, but the technology of optical media quite simply isn’t that old, yet.

Issues: Obsolescence, if unprepared for, can result in unreadable media and unreadable formats due to changes and updates in technological interfaces and hardware.

Some physical threats can be addressed with proper storage and handling, as identified concisely by the Digital Preservation Management Workshops and Tutorial. It also has recommendations for CDs and DVDs as preferred storage media. For an amusing break, visit the Chamber of Horrors: Obsolete and Endangered Media to see various computing storage media that went the way of the Dodo and the Great Auk.

Caring for Optical Media

In the December 2008 edition (PDF) of Communications of the ACM, the monthly magazine of the Association for Computing Machinery, Dr. Fran Berman, director of the San Diego Supercomputer Center (SDSC) at the University of California, San Diego, provided a guide for surviving what has become known as the “data deluge.” Her top 10 tips?

1. Make a plan. Create an explicit strategy for stewardship and preservation for your data, from its inception to the end of its lifetime; explicitly consider what that lifetime may be.

2. Be aware of data costs and include them in your overall IT budget. Ensure that all costs are factored in, including hardware, software, expert support, and time. Determine whether it is more cost-effective to regenerate some of your information rather than preserve it over a long period.

3. Associate metadata with your data. Metadata is needed to be able to find and use your data immediately and for years to come. Identify relevant standards for data/metadata content and format, following them to ensure the data can be used by others.

4. Make multiple copies of valuable data. Store some of them off-site and in different systems.

5. Plan for the transition of digital data to new storage media ahead of time. Include budgetary planning for new storage and software technologies, file format migrations, and time. Migration must be an ongoing process. Migrate data to new technologies before your storage media becomes obsolete.

6. Plan for transitions in data stewardship. If the data will eventually be turned over to a formal repository, institution, or other custodial environment, ensure it meets the requirements of the new environment and that the new steward indeed agrees to take it on.

7. Determine the level of “trust” required when choosing how to archive data. Are the resources of the U.S. National Archives and Records Administration necessary or will Google do?

8. Tailor plans for preservation and access to the expected use. Gene-sequence data used daily by hundreds of thousands of researchers worldwide may need a different preservation and access infrastructure from, for example, digital photos viewed occasionally by family members.

9. Pay attention to security. Be aware of what you must do to maintain the integrity of your data.

10. Know the regulations. Know whether copyright, the Health Insurance Portability and Accountability Act of 1996, the Sarbanes-Oxley Act of 2002, the U.S. National Institutes of Health publishing expectations, or other policies and/or regulations are relevant to your data, ensuring your approach to stewardship and publication is compliant.

