SCHRÖDINGER’S BACKUP: And Why You Should Upgrade Your 3–2–1 Backup Rule
In case you have not heard of Schrödinger’s cat: it is a thought experiment devised by the Austrian physicist Erwin Schrödinger in 1935. As the experiment goes, if you seal a cat in a box with a vial of poison that may be released at any moment, you cannot know whether the cat is alive or dead until you open the box. Thus, until you open the box, the cat is considered simultaneously dead and alive.
So, what do cats and backups have in common? For starters, they both seem to have a plan. Further, they both have a habit of knocking down our dearest belongings when we least expect it.
Backing up data is a critical part of any organization’s Disaster Recovery plan. Why? Because data is precious! Watching hours of work disappear before our eyes because a computer crashes, the power goes out, or for any of a myriad other reasons is not something we look forward to. Backups provide a safety net in the face of unexpected data loss. Justifying the need for data backup within any enterprise today should be a simple task. However, determining an organization’s best backup strategy is not as easy. There are various software, hardware, and cloud options to choose from, combined with a number of suggested policies and procedures.
One of the most popular data backup strategies originated with a creative professional, the photographer Peter Krogh, rather than with someone working in the Information Technology field or a standards organization, as one might have expected. The rule that Peter Krogh described is known as the 3–2–1 Backup Rule. It sets out the following requirements:
• 3 Copies of Data: Maintain three copies of data: the original and at least two backups.
• 2 Different Media: Use two different media types for storage; this guards against failures that are specific to one type of storage media.
• 1 Copy Offsite: Keep one copy offsite; this prevents data loss due to a site-specific failure.
The 3–2–1 Backup rule is a revered, time-honored backup strategy, and it is a rule to live by. But is it enough?
Let us revisit the infamous incident at Pixar Studios. Back in 1998, roughly a year before Toy Story 2 was released, disaster struck. One of the film’s animators, while routinely clearing out files, entered the deletion command rm -rf * at the root directory of Toy Story 2’s project on Pixar’s internal servers. The team noticed when character models began disappearing from their works in progress. They pulled the plug on the file servers, only to find that 90% of the work from the last two years had been lost.
The team reacted quickly by bringing in their tape backups. But the tapes in use back in 1998 had an upper limit of 4 GB. Unfortunately, the movie project had grown to over 10 GB, and because the error log was itself saved at the end of the tape, the failures went unnoticed, rendering all the backups useless. The team only realized this when they actually attempted to restore the data. Luckily for us all, the movie’s technical director, Galyn Susman, was able to save the day. She had been working from home following the recent birth of her child, and thus had a backup copy of the film on her home computer. Staff carried her computer into the office, where the team successfully recovered a two-week-old backup with almost all the original data, allowing them to resume work and deliver the finished film on schedule.
The Pixar team recovered nearly all of the lost assets, save for a few recent days of work, and the film could proceed. Had it not been for Galyn Susman’s baby boy Eli, we would never have had this version of Toy Story 2. In reality, though, it was an offsite backup that saved the story.
This near disaster at Pixar Animation Studios shows us the importance of backups, especially for critical data, and, most importantly, of verifying the integrity of backed-up data. Without that verification, the loss of data can be catastrophic.
Invoking Schrödinger’s cat again, let us extrapolate this as follows:
Schrödinger’s Backup: The condition of any backup is unknown until a restore is attempted.
Almost all people and organizations are running their own Schrödinger’s Backup experiment. They configure their backups the way they guess will work, watch them run a few times without error, and assume everything will keep running smoothly from then on. When disaster strikes, they try to restore the data they backed up and realize to their horror that it does not restore the way they thought it would.
So, let us bring some redundancy into the 3–2–1 Backup rule that we commonly follow.
The upgraded rule takes the 3–2–1 Backup rule as a starting point and adds two further conditions to ensure recovery from any type of incident. This new 3–2–1–1–0 Backup rule should satisfy the following requirements:
• 3 Copies of Data: Maintain three copies of data: the original and at least two backups.
• 2 Different Media: Use two different media types for storage; this guards against failures that are specific to one type of storage media.
• 1 Copy Offsite: Keep one copy offsite; this prevents data loss due to a site-specific failure.
• 1 Copy Offline: Keep one copy offline, immutable, or air-gapped; this helps ensure recovery after a ransomware event.
• 0 Errors: Verify that the backed-up data has no errors.
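To make the five requirements concrete, here is a minimal Python sketch that checks a backup inventory against them. Everything in it is an assumption for illustration: the `BackupCopy` record, its field names, and the `check_3_2_1_1_0` function are hypothetical and do not come from any backup product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BackupCopy:
    """One copy of the data (hypothetical record, for illustration only)."""
    name: str
    media: str                           # e.g. "disk", "tape", "cloud"
    offsite: bool = False                # stored away from the primary site
    isolated: bool = False               # offline, immutable, or air-gapped
    verify_errors: Optional[int] = None  # None means "never verified"
    is_original: bool = False            # the live data, not a backup


def check_3_2_1_1_0(copies: list[BackupCopy]) -> dict[str, bool]:
    """Return a pass/fail flag for each part of the 3-2-1-1-0 rule."""
    backups = [c for c in copies if not c.is_original]
    return {
        "3 copies": len(copies) >= 3,
        "2 media": len({c.media for c in copies}) >= 2,
        "1 offsite": any(c.offsite for c in copies),
        "1 isolated": any(c.isolated for c in backups),
        # Every backup must have been verified, with zero errors found.
        "0 errors": bool(backups) and all(c.verify_errors == 0 for c in backups),
    }


# An inventory that satisfies plain 3-2-1 but not the upgraded rule.
inventory = [
    BackupCopy("production", "disk", is_original=True),
    BackupCopy("local NAS", "disk", verify_errors=0),
    BackupCopy("cloud bucket", "cloud", offsite=True, verify_errors=0),
]
report = check_3_2_1_1_0(inventory)
print([rule for rule, ok in report.items() if not ok])  # ['1 isolated']
```

Note that a backup that has never been verified (`verify_errors=None`) fails the "0 errors" check in this sketch. That is a deliberate choice: an untested backup is treated as Schrödinger’s Backup, not as a healthy one.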
These two additions are critically important for any backup scenario. A copy of the backup data that is offline, immutable, or air-gapped is an incredibly resilient safeguard that helps ensure data recovery after a ransomware event. Zero errors is the baseline we should start from: once the backup process completes, the integrity of the backed-up data should be verified. If your backup method or application does not support this potentially lifesaving feature, it is time to switch to one that does.
Thus, even a single modification to your backup strategy can make all the difference in the world. Remember to back up your data successfully, test your backups regularly, and have a strategy ready to restore your data and/or infrastructure within a predefined time frame. If you just back up your data and hope you will be able to recover in time, you are betting against Murphy.
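One simple way to check the "0 errors" condition is to record a checksum for every file at backup time and compare those checksums against a test restore. The sketch below is only an illustration under that assumption; the function names `build_manifest` and `verify_restore` are made up here, and real backup tools usually ship their own verification features.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_restore(manifest: dict[str, str], restored_root: Path) -> list[str]:
    """Compare a restored tree against the manifest taken at backup time.

    Returns a list of problems; an empty list means zero errors.
    """
    errors = []
    for rel_path, expected in manifest.items():
        candidate = restored_root / rel_path
        if not candidate.is_file():
            errors.append(f"missing: {rel_path}")
        elif sha256_of(candidate) != expected:
            errors.append(f"corrupt: {rel_path}")
    return errors
```

In use, you would call `build_manifest` on the source data when the backup is taken, store the manifest alongside the backup, and periodically restore to a scratch location and run `verify_restore` against it. The important part is that the check runs against an actual restore, not against the backup job’s own "success" message.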