The Software Development Blog | AndPlus

An Integrated Backup Solution [Part I]

Written by Brian Geary | Jun 23, 2010 4:00:00 AM

When it comes to your company, nothing is more valuable than its data, yet it is often taken for granted. Taken for granted, that is, until a hard disk fails. Many companies set up a simple policy of mirroring their data to an external drive every night. This works great until the wrong file gets published: by the time anyone notices, the backup has already mirrored the incorrect version, and nobody knows which version should be on the server. I've even seen companies that just hope "somebody has a copy" and ignore the idea of backups until something breaks badly.

At AndPlus Design, we do things a little differently. While keeping developers' source code in source control is standard policy almost everywhere, servers are another matter entirely. This is a sad state of affairs, because there's really no reason not to version them as well. Disk space is a commodity, much cheaper than time, and reliable source control solutions like Subversion have no trouble storing large numbers of large files. So why not mirror a server's web directory in source control, keeping the live version alongside all of the development versions? Even more commonly neglected than a server's file system are the server's data records. So why not back these up in source control too?

Now if a server's security is breached and data is modified, or if a user accidentally deletes their account, or even if a server suffers a fatal crash, we can look at previous versions of files and data in exactly the same way we would view changes between development versions of software. And of course, if something breaks on a website, we can pinpoint when it changed and what caused it.
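To make the idea concrete, here is a minimal sketch of comparing two versions of a data dump, much as source control would when showing what changed between revisions. It uses only Python's standard `difflib`; the function name `changed_lines` and the sample SQL rows are made up for illustration and are not taken from our actual tooling.

```python
import difflib


def changed_lines(old_text, new_text):
    """Return (removed, added) lines between two versions of a text dump."""
    removed, added = [], []
    # ndiff prefixes each line with "- " (only in old), "+ " (only in new),
    # "  " (unchanged) or "? " (intraline hints, ignored here).
    for line in difflib.ndiff(old_text.splitlines(), new_text.splitlines()):
        if line.startswith("- "):
            removed.append(line[2:])
        elif line.startswith("+ "):
            added.append(line[2:])
    return removed, added


# Hypothetical dumps from two consecutive nights.
yesterday = (
    "INSERT INTO users VALUES (1,'alice');\n"
    "INSERT INTO users VALUES (2,'bob');\n"
)
today = "INSERT INTO users VALUES (1,'alice');\n"

removed, added = changed_lines(yesterday, today)
for row in removed:
    print("missing since last backup:", row)
```

In practice the source control client does this comparison for you; the point is only that a deleted account shows up as a removed line, just like a deleted line of code.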

Based upon the principles above, AndPlus Design has an automated backup solution that can easily handle an arbitrary number of websites and servers. Every website's files and databases are stored in source control daily. Then, source control itself is backed up to multiple locations. This is all implemented as a small set of Python scripts. Of course, nothing is as simple as it looks, and in Part II we will go over actual implementation strategies, gotchas, and solutions.
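As a rough illustration of the daily step described above (not our actual scripts, which Part II covers), the sketch below builds the commands for one site: dump the database into a Subversion working copy, schedule any new files, and commit. It assumes a MySQL database dumped with `mysqldump` and an existing working copy containing a `db` directory; the `Site` class, the `backup_commands` and `run_backup` functions, the paths, and the directory layout are all hypothetical.

```python
import subprocess
from dataclasses import dataclass
from datetime import date
from pathlib import Path


@dataclass
class Site:
    name: str
    working_copy: Path  # assumed: a Subversion working copy of the web directory
    database: str       # assumed: name of a MySQL database


def backup_commands(site, today):
    """Build the commands for one site's daily backup, without running them."""
    # Hypothetical layout: the dump lives in a "db" directory that already
    # exists inside the working copy.
    dump_file = site.working_copy / "db" / f"{site.database}.sql"
    message = f"Automated backup of {site.name} on {today.isoformat()}"
    return [
        # One row per INSERT and no dump timestamp keep nightly diffs small.
        ["mysqldump", "--skip-extended-insert", "--skip-dump-date",
         f"--result-file={dump_file}", site.database],
        # Schedule files that appeared since the last run. Files deleted on
        # disk are not handled by this sketch.
        ["svn", "add", "--force", str(site.working_copy)],
        ["svn", "commit", "-m", message, str(site.working_copy)],
    ]


def run_backup(site):
    """Run each command in order, stopping at the first failure."""
    for command in backup_commands(site, date.today()):
        subprocess.run(command, check=True)
```

Keeping command construction separate from execution makes it easy to loop over any number of `Site` entries and to inspect what would run before pointing the script at a live server. Backing up the repository itself to other locations would be a separate step.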