How I Back Up My Home System

Anyone who has been involved with computing and digital storage recognizes the value of backup. Even with the much improved MTBF of traditional HDDs and the advent of SSDs, the reality is that drives, controllers, etc. can fail, and the potential loss of data can be devastating. No one wants to lose all of their kids' photos or their music and video collection.

The challenge has always been: what is the best way to back up, and what should I back up?

There are all kinds of methods and software that can provide data protection and backup services. I've taken the following route.

I use a two-drive approach to my primary storage. First is an SSD acting as the OS drive. In my case, I multi-boot that drive to Windows 7, 8.1 and Ubuntu Linux. Second is my data drive, where I keep all my relevant content. This includes all Documents, Photos, Music, Videos, etc. It is very easy in Windows to point your profile's default folders, and any specialized folders you have created, to a different drive.

Over the years I have found this approach to provide two key benefits. First, it gives you great freedom to re-partition, reformat or replace the OS disk without affecting your data. Second, it reduces the overall I/O work that your data drive has to do. While the OS drive has to maintain things like swap files and indexes, you're only accessing your data drive when you actually need to get to some particular content.

Next, a word about RAID. While this technology has been around for years, I have never found it to be infallible when it comes to protecting data. I typically use RAID simply to create large volumes by striping disks, so RAID-0. For example, before 3 and 4 TB disks were readily available and cheap, I would stripe multiple 1 or 2 TB drives to get the same result.

The challenge I have always found is that when you add some level of protection, from RAID-1 through RAID-5, it doesn't work when you really need it, and so it is a waste of your total storage. Usually the event that caused the RAID failure either A) affects multiple disks, thereby killing the protection scheme, or B) affects the controller or disk enclosure, causing a non-recoverable state.

Bottom line - simplify as much as you can to meet your storage requirements. With the advent of larger 3 and 4 TB drives - you can store a massive amount of content on a single spindle, a simple RAID-0 stripe or one of the new OS Services like Windows Storage Spaces. 

Okay - so now that I've covered my approach to basic storage, let's discuss backup. 

Over the years, I've used OS based tools like Windows Backup as well as various 3rd party tools from Symantec and others.   In my opinion they are all a waste of time and have similar issues to the RAID discussion - when you really need it - you won't be able to recover.

I never back up my OS drive. Since I never store any real data there, I don't care if it fails. Sure, I have to rebuild from scratch if the disk fails, but I often find that it is much faster and easier to rebuild from scratch than to boot from a "recovery" disk, attach to a backup store, and try to recover the OS content from there. I maintain a library of OS and application disks and their associated license keys, so if for some reason my OS disk dies, I just replace the drive, insert the OS DVD and go. That is my backup for the OS.

On the data side - I've never liked the approach of many "backup" applications that collect info about the directories, scan and then write the data to a completely different storage format and catalog it in a database.  What I want is effectively a mirror of my data drive that is immediately accessible and portable.  I also want something that is fast - many backup applications are horrifically slow.  

Backup vendors say they only copy the changed sectors of a file, etc., but anyone who has looked at a backup drive used by one of these applications immediately notices the massive number of cryptically named "backup" files. So guess what: you can't access your data without the recovery app. And if you also need to recover the OS drive, you have to make sure that your recovery app is at the right version and patch level to read that backup. Basically it's a fool's errand.

I use a very simple tool. It's called Robocopy. This tool was originally created by MS for internal use by their OS and Premier Support folks to make exact copies of a volume and directory structure. For many years it was not included with Windows and you had to either A) request it from MS, B) buy the Resource Kit for your OS, or C) have a TechNet or MSDN account to get it. Today that is not the case. It's in the OS distribution, and in my opinion it is one of the hidden gems.

Robocopy is a very powerful, folder-level data copy tool. It is fast. When mirroring, it will not just copy but also delete from the target any files you have removed from the source directory, and it will skip files already on the target.

So for my system, I use an external USB 3.0 3 TB drive for backup. I then use the following Robocopy command:

Robocopy SourceDrive:  TargetDrive: /mir /r:2 /w:10  

Where the SourceDrive is the drive letter of my primary data drive and the TargetDrive is the drive letter of my backup.  

The /mir option means "mirror", so it produces an exact copy of the source directory on the target. As I mentioned, if I delete or move a file from a directory, that will be reflected on the target the next time I run Robocopy. It also skips files that are already on the target and unchanged (by default it compares size and timestamp), so it is extremely fast.

Robocopy does have one limitation: it will not copy a file that is "open" (locked by another program). But that is often the case with many backup applications as well. Robocopy will retry the copy, waiting a set number of seconds between attempts. That is what the /r:2 and /w:10 options are for. In this case they say: retry twice, waiting 10 seconds before each retry.
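
If you would rather launch the copy from a script, here is a minimal sketch in Python, assuming Python is installed on the machine. The function names are hypothetical, and the script is not part of my own setup. The success check relies on Robocopy's convention that its exit code is a set of bit flags, where a value of 8 or higher means at least one file could not be copied or a fatal error occurred.

```python
import subprocess


def build_robocopy_args(source, target, retries=2, wait_seconds=10):
    """Build the argument list for the mirror command described above."""
    return [
        "robocopy",
        source,
        target,
        "/mir",                # mirror: copy new or changed files, purge removed ones
        f"/r:{retries}",       # number of retries for a file that fails to copy
        f"/w:{wait_seconds}",  # seconds to wait between retries
    ]


def robocopy_succeeded(exit_code):
    """Exit codes below 8 mean no copy failures (1 = files copied, 2 = extras found)."""
    return 0 <= exit_code < 8


def run_mirror(source, target):
    """Run the mirror and report success. Windows only, since it needs robocopy.exe."""
    result = subprocess.run(build_robocopy_args(source, target))
    return robocopy_succeeded(result.returncode)
```

Calling run_mirror("D:\\", "E:\\") would mirror a data drive D: to a backup drive E:. Both drive letters are placeholders for your own.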

What I love about this approach is I basically end up with an exact mirror of my data drive on my backup and it is immediately available.  It is also portable - I can disconnect the external drive - take it to another computer and there is my content - all of it - immediately available and ready to go.

The next step is automating the process.  Again I use the capabilities in the Windows OS to do this.  If you open the Computer Management Application you will see in System Tools an item called Task Scheduler.  This has been around for years and MS and 3rd party companies use this capability to schedule various background jobs.  

You can simply open the Scheduler, create a new task and call it "backup". Task Scheduler gives you a number of options, such as running Robocopy under a special set of credentials, identifying "triggers" to start the job, the "action" (the Robocopy command itself) and then of course the schedule.

For me it's every day at 3 AM. As with any backup job, the initial population of your target drive will take a bit, but it is still faster than many backup applications. And since Robocopy will only copy the changed files, subsequent "copies" can be very fast. My "delta" backups often finish in less than 2 minutes, and I have 3 TB stored. Also, if you do a large reorg of your primary data drive, you can simply open a command prompt and run the command, or you can go into Task Scheduler and say "Run now", and know you will have an exact copy of that reorg on your target.
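
The same daily job can also be registered from a script instead of the Task Scheduler window. The sketch below is one way that might look, assuming Python is installed. It builds a call to the schtasks command that ships with Windows, using its /Create, /TN, /TR, /SC and /ST options. The task name "Data Backup" and the D: and E: drive letters are hypothetical.

```python
import subprocess


def build_schtasks_args(task_name, command, start_time="03:00"):
    """Build a schtasks call that registers a task to run once a day."""
    return [
        "schtasks", "/Create",
        "/TN", task_name,   # task name as shown in Task Scheduler
        "/TR", command,     # the command line the task runs
        "/SC", "DAILY",     # schedule type: every day
        "/ST", start_time,  # start time in 24-hour HH:MM format
    ]


def register_daily_backup(task_name, command, start_time="03:00"):
    """Register the task. Windows only, and it may need an elevated prompt."""
    args = build_schtasks_args(task_name, command, start_time)
    return subprocess.run(args).returncode == 0
```

For example, register_daily_backup("Data Backup", "robocopy D:\\ E:\\ /mir /r:2 /w:10") would set up the 3 AM job described above.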

What is also nice is that because you end up with an exact copy of your data drive, you can easily see if your backup worked. By opening your Computer folder you will see your data drive and your backup drive, and if the free space on the two is the same, it's working.
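
That check can be scripted too. The sketch below is a rough illustration, assuming Python is installed. It compares used space rather than free space, so it also works when the two drives are not the same size. The 1% tolerance is my own assumption, since each file system keeps a little bookkeeping data of its own. The function names are hypothetical.

```python
import shutil


def used_bytes(path):
    """Return the number of bytes in use on the volume that holds path."""
    return shutil.disk_usage(path).used


def roughly_equal(a, b, tolerance=0.01):
    """True when a and b differ by at most tolerance, as a fraction of the larger."""
    largest = max(a, b)
    if largest == 0:
        return True
    return abs(a - b) <= tolerance * largest


def backup_looks_current(source_path, target_path, tolerance=0.01):
    """Compare used space on the data drive and the backup drive."""
    return roughly_equal(used_bytes(source_path), used_bytes(target_path), tolerance)
```

Calling backup_looks_current("D:\\", "E:\\") with your own drive letters gives a quick yes or no. It only compares totals, so it is a sanity check and not a file-by-file verification.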

You can also use this technique to back up across the network. You can "map" a drive letter to a share on another machine and Robocopy will copy to that target.

There are some considerations to this approach. First, there is no compression on the target, so you must size your backup to match your primary storage. In my case I have a 3 TB data drive and a 3 TB backup, so it can get more complex if you have very large primary stores. Second, there is no "version" control or point-in-time copies. If you mistakenly erase a file from your source, your backup drive will only have it until the next Robocopy job runs. So if you use backup as erase protection, please keep this in mind.
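
If accidental deletes worry you, one option is to look at what the next mirror would remove before it runs. The sketch below, assuming Python is installed, lists files that exist on the backup but are no longer on the source. Those are the files the next mirror run would delete from the backup. The function names are hypothetical, and this is an illustration, not something Robocopy does for you.

```python
import os


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            full_path = os.path.join(folder, name)
            found.add(os.path.relpath(full_path, root))
    return found


def files_only_in_target(source, target):
    """Files present on the backup but missing from the source, sorted by path."""
    return sorted(relative_files(target) - relative_files(source))
```

Anything returned by files_only_in_target("D:\\", "E:\\") (drive letters are placeholders) can still be copied back from the backup drive until the next scheduled run.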

At the end of the day, I have found that by separating OS from data and keeping your disk storage as simple as possible, you will reduce your exposure and increase your flexibility and ability to recover from disk failures. By then adding a simple backup approach like Robocopy, you have the peace of mind of having an exact copy of your data at any time.

