DIY Encrypted Cloud Backup Using Raspberry Pi

Ranganathan Sankaralingam
Jan 20, 2016


The floods in Chennai washed away many people’s books and photos. Those stories finally motivated me to get serious about an off-site digital backup solution.

I keep two copies of my photos on two external hard disks, manually syncing them every few weeks. However, since both disks are in the same place (my house), I’m not fully covered in the event of a natural disaster.

What I need is another copy of my data far away from my house to protect against flooding and similar disasters. This is also known as geographic redundancy or off-site backup.

One way to solve this problem is to outsource it to some company. But what’s the fun in that?

Overview

I use a pair of geographically distant Raspberry Pis, one at my house and one at my brother’s house. I’ll call the Pi at my house the on-site or local one and the one at my brother’s house the off-site or remote one. Each has a USB external drive attached to it. I use Bittorrent Sync to keep these disks synchronized. That’s the gist.

My setup has a few nice things on top of the basics:

  • Privacy: only encrypted versions of my files are present on the off-site Pi.
  • Expandability: you can easily add another disk down the road to store more files.
  • Safety: Bittorrent Sync keeps revisions of old files, so you have some protection against accidental deletion. To make recovery from such errors easier, I wrote BTSync Rewind, an open-source software tool that presents a Time Machine-like point-in-time snapshot view of your Bittorrent Sync repository.

Those with some Linux experience should be able to replicate my setup with the instructions below. I gloss over the standard steps such as installing the Raspberry Pi, setting up ssh remote access, and getting Bittorrent Sync going.

Disclaimer: I tried everything myself and made every effort to ensure that the commands and advice below are correct. I provide the reasoning behind each step so that you can make an informed decision. But ultimately you read and follow the procedures below at your own risk. I am not responsible for any losses you suffer by following these instructions.

My Hardware and Software

Watch the Watts aka Wiring for Reliability

The first-generation Raspberry Pi that I have can’t supply a lot of power to its USB ports. It freezes when I attach the wifi adapter or external drive directly.

I attach the wifi adapter and hard disk to a powered USB hub that I connect to the Pi. I use one port from the hub to serve as the Pi’s power supply.

Connecting the wifi adapter and hard disk to the same USB hub lowers both disk and wifi performance, but it’s fast enough for me and has been super-stable.

The Raspberry Pi B+ and Pi 2 can supply significantly more power. If you have one of those, you can avoid the powered hub. Make sure you get a nice big power supply like the iPad one (2A or more at 5V). Good explanation of the improved power supply and discussion on maxing out the ports.

Installation

Raspbian

I followed the stock instructions: https://www.raspberrypi.org/downloads/raspbian/

Also install the following packages. We choose the simpler ‘msmtp-mta’ package rather than the default ‘postfix’ that Raspbian chooses.

sudo apt-get install mdadm lvm2 encfs udisks msmtp-mta

Enabling Wifi

This was problematic in the past, but all drivers are built in now, so it should be a breeze: http://www.savagehomeautomation.com/projects/raspberry-pi-installing-the-edimax-ew-7811un-usb-wifi-adapte.html

Installing Bittorrent Sync

Go to https://www.getsync.com/platforms/desktop and get the ARM version. Install this on your Raspberry Pi. Try syncing a few files to make sure it’s working first.

Depending upon your home firewall setup, you may need to open up the TCP port at one of the ends. Look under preferences in the web UI to find the port.

Update /etc/rc.local to start Bittorrent Sync on startup. Remember to run it as user ‘pi’ (prefix the command with ‘sudo -u pi’).

Creating Expandable Disk Architecture

Next we’ll lay the foundations of a highly-expandable disk architecture. We’ll start with just one disk, but we’ll be able to easily add more disks, and easily replace the disk when it starts wearing out. I recommend doing this.

We’ll use Linux LVM (Logical Volume Manager) to allow us to add disks later. We’ll use the RAID mirroring system in a slightly unconventional way to make it easy to replace disks.

Linux RAID mirroring has the cool ability to create a mirrored RAID array with some disks “missing” at creation time. Such an array starts out as a “degraded” array. Although it sounds useless, this will come in handy later.

We create a mirrored RAID array consisting of our one disk and a “missing” disk. We’ll then format the RAID array as an LVM “physical volume”. Multiple physical volumes can be tied together into a “volume group” that is as big as the total space in all the physical volumes it contains. Finally, we can assign space, called “logical volumes”, from volume groups and create file systems on logical volumes.

Creating RAID Array

RAID can work on raw disks without any partitioning, so we won’t bother making partitions, saving ourselves the hassles of GPT vs. MBR, large partitions, primary vs. secondary etc. We’ll assume that an external disk exists at address /dev/sda in our examples.

Notes: You need to type the word “missing” exactly as given in the command below. ‘sudo lsblk’ is your friend if you want to confirm the device address.

sudo mdadm --create --verbose /dev/md0 --level=mirror --raid-devices=2 /dev/sda missing

You’ll get a question about whether you really want to use the entire disk instead of a partition. Answer yes.

mdadm: Note: this array has metadata at the start and
may not be suitable as a boot device. If you plan to
store '/boot' on this device please ensure that
your boot-loader understands md/v1.x metadata, or use
--metadata=0.90
mdadm: size set to <somenumber>
Continue creating array? y <---- This is your answer.
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started <---- This is what you want to see.
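
To confirm that the new array really is running in the intended degraded state, one option is to read its member status from /proc/mdstat. The helper below is a minimal sketch, assuming the usual /proc/mdstat layout where each array's header line is followed by a line ending in a status string such as [U_]. The function name md_member_status is hypothetical and not part of mdadm.

```shell
# Print each md array name followed by its member-status string,
# e.g. "md0 [U_]". Reads /proc/mdstat-style text on standard input.
md_member_status() {
    awk '
        /^md[0-9]+ :/ { name = $1 }      # array header line, e.g. "md0 : active raid1 sda[0]"
        name != "" && /blocks/ {         # the size/status line that follows it
            print name, $NF              # last field is the status, e.g. [U_]
            name = ""
        }
    '
}
```

Run it as `md_member_status < /proc/mdstat`. In the status string, a U stands for a member that is up and an underscore for a missing one, so a one-disk mirror with a “missing” partner should report something like `md0 [U_]`.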

Setting up LVM

These are the four commands:

sudo pvcreate /dev/md0
sudo vgcreate datavg /dev/md0
sudo lvcreate --extents '+100%FREE' --name datalv datavg
sudo mkfs.ext4 -L datalv -m 0 /dev/datavg/datalv

Here’s what the commands do. First, formatting /dev/md0 as an LVM physical volume:

sudo pvcreate /dev/md0

You should see:

Physical volume "/dev/md0" successfully created

Setting up a trivial volume group named “datavg” with only the /dev/md0 physical volume:

sudo vgcreate datavg /dev/md0

You should see:

/proc/devices: No entry for device-mapper found
Volume group "datavg" successfully created

Now create a logical volume named ‘datalv’ to which we give all the space in the volume group:

sudo lvcreate --extents '+100%FREE' --name datalv datavg

Creating a file system to store files. We set a label (-L option) so that this disk can be automatically mounted at a fixed location, /media/datalv.

sudo mkfs.ext4 -L datalv -m 0 /dev/datavg/datalv

You should see something like:

mke2fs 1.42.12 (29-Aug-2014)
Discarding device blocks: done
Creating filesystem with nnnnnn 1k blocks and kkkkkkk inodes
Filesystem UUID: xxxxxxxx-xxxx-xxxx-dddd-aaabbbcccddd
Superblock backups stored on blocks:
<some numbers>
Allocating group tables: done
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
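
At this point the whole stack exists: disk, RAID array, physical volume, volume group, logical volume, and file system. A quick way to see how the layers fit together is to list each one in turn. This is only a sketch, assuming the names used above (datavg and datalv). The wrapper function show_storage_stack is a hypothetical convenience, but the commands inside it are the standard tools.

```shell
# Print every layer of the storage stack, from raw disk up to the file system.
show_storage_stack() {
    lsblk                            # block devices: disk -> md array -> logical volume
    cat /proc/mdstat                 # RAID arrays and their member disks
    sudo pvs                         # LVM physical volumes
    sudo vgs                         # volume groups and their free space
    sudo lvs                         # logical volumes
    sudo blkid /dev/datavg/datalv    # file system type and label
}
```

Calling `show_storage_stack` after any of the later steps (adding or replacing a disk) is a cheap sanity check that each layer grew the way you expected.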

Mounting LVM Logical Volume on Startup (Recommended Way)

Add the following line to /etc/rc.local before the line that starts Bittorrent Sync. The chown command sets the default Raspbian user as the owner of the data disk.

sudo udisks --mount /dev/datavg/datalv && chown pi:pi /media/datalv

Now the logical volume will show up at /media/datalv on startup.

Mounting LVM Logical Volume on Startup (Old School Way)

Create the folder where the file system we created above will appear:

sudo mkdir /media/datalv

Add the following line to the file /etc/fstab (the number of spaces between each word doesn’t matter; I like to use 1 space). The ‘nobootwait’ option makes sure the Pi continues booting even if the disk array can’t be activated. Without this, it will endlessly wait at a terminal prompt asking what to do, preventing networking from starting. You’ll need networking to debug this remotely.

/dev/datavg/datalv /media/datalv ext4 defaults,nobootwait 0 0

Add this line to /etc/rc.local somewhere near the top to give ownership of the disk to the default Raspbian user:

chown pi:pi /media/datalv

Creating Sync Folder

Finally let’s create a folder within /media/datalv where we’ll put the files to be sync’d:

sudo mount /media/datalv
sudo mkdir /media/datalv/btsync-src
sudo chown pi:pi /media/datalv/btsync-src

Reboot to make sure everything works:

sudo shutdown -r now
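
After the reboot, it helps to check that the disk came up where Bittorrent Sync expects it and that the ‘pi’ user owns it. The following is a minimal sketch, assuming the mount point /media/datalv and the user ‘pi’ from the steps above. The function name check_data_disk is hypothetical.

```shell
# Succeeds only if DIR (default /media/datalv) is a mount point owned by 'pi'.
check_data_disk() {
    dir=${1:-/media/datalv}
    if ! mountpoint -q "$dir"; then
        echo "$dir is not mounted" >&2
        return 1
    fi
    owner=$(stat -c %U "$dir")       # user name of the directory's owner
    if [ "$owner" != "pi" ]; then
        echo "$dir is owned by $owner, expected pi" >&2
        return 1
    fi
    df -h "$dir"                     # show size and free space as a final check
}
```

Run `check_data_disk` with no arguments once the Pi is back up. If it complains, the mount line or the chown line in /etc/rc.local is the first place to look.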

Adding a Second Disk (Future Reference)

You’ve leveled up and started shooting 4K video in 3D. In just two months, you’ve almost filled up your first disk. Congratulations! Time to add another disk.

First attach the second disk to the powered USB hub. It doesn’t matter what the capacity is; it’s your choice. The basic idea is to create a second RAID array of two disks, with one disk “missing”. Then format it as an LVM physical volume. Then add it to the volume group. Finally, resize the logical volume and expand the file system.

Assuming your second disk was assigned the device address /dev/sdb:

sudo mdadm --create --verbose /dev/md1 --level=mirror --raid-devices=2 /dev/sdb missing
sudo pvcreate /dev/md1
sudo vgextend datavg /dev/md1
sudo lvresize --resizefs --extents '+100%FREE' datavg/datalv

Replacing a Disk (Future Reference)

It’s been three years and everything has been working well so far. However, you received an email alert from your Raspberry Pi that one of the disks, /dev/sda, is nearing its end of life according to SMART data. Time to replace it (coming next: article for how to set up alerting). The replacement drive needs to be at least the size of the failing disk.

This is when the fake RAID array is going to pay off. And oh, drives have gotten bigger and cheaper, so you’re going to replace the 1TB drive with a 4TB one. Don’t worry, you won’t be stuck with 3TB of wasted space. :)

Note: Irritatingly, although you created a RAID device named /dev/md0, it might have been renamed to something else after a reboot. There are ways to force the name to persist, but I personally like to force as few settings as possible. Use ‘cat /proc/mdstat | grep sda’ to find out which RAID array contains the failing drive, /dev/sda. Let’s say the new name is /dev/md126.
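
The lookup in /proc/mdstat can also be wrapped in a small helper that prints only the array's device path, which is handy if you want to paste it into the commands that follow. This is a sketch under the assumption that member disks appear on the array's header line in the form sda[0]. The function name md_array_for_disk is hypothetical.

```shell
# Print the /dev/mdN device whose member list contains DISK (e.g. "sda").
# Reads /proc/mdstat-style text on standard input.
md_array_for_disk() {
    awk -v disk="$1" '
        /^md[0-9]+ :/ {
            for (i = 3; i <= NF; i++) {
                member = $i
                sub(/\[.*/, "", member)          # "sda[0]" -> "sda"
                if (member == disk) print "/dev/" $1
            }
        }
    '
}
```

For example, `md_array_for_disk sda < /proc/mdstat` would print /dev/md126 in the scenario described here. Unlike a plain grep, it compares the whole member name, so it will not confuse sda with a longer name that merely starts with the same letters.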

Now plug the new drive into the powered USB hub. Use ‘sudo lsblk’ to find out what the drive’s device address is. We’ll assume it got /dev/sdc. With this drive too, we’re not going to bother with partitioning.

Add the new drive to the array containing the failing drive:

# Change the device addresses to match your system.
sudo mdadm --add /dev/md126 /dev/sdc

You should see a brief message:

mdadm: added /dev/sdc

The Pi will immediately start copying data to the new drive. Doing ‘cat /proc/mdstat’ will show the progress:

$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md126 : active raid1 sdc[2] sda[0]
8384448 blocks super 1.2 [2/1] [U_]
[==>..................] recovery = 14.3% (1200000/8384448) finish=0.4min speed=240000K/sec
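
Rather than re-running cat by hand, you could poll the progress line until the rebuild is over. The sketch below assumes the "recovery = NN.N%" wording shown in the output above. Both function names are hypothetical.

```shell
# Extract the rebuild percentage from /proc/mdstat-style text on standard input.
md_recovery_percent() {
    sed -n 's/.*recovery = *\([0-9.]*\)%.*/\1/p'
}

# Poll once a minute until no array reports a recovery in progress.
wait_for_recovery() {
    while grep -q 'recovery' /proc/mdstat; do
        echo "rebuild at $(md_recovery_percent < /proc/mdstat)%"
        sleep 60
    done
    echo "no recovery in progress"
}
```

Call `wait_for_recovery` inside a screen or tmux session so that it survives a dropped ssh connection. As an alternative, `sudo mdadm --wait /dev/md126` blocks until any resync or recovery on that array has finished.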

Wait for recovery to finish. It may take a few hours. Once it’s done, it’s time to retire the failing drive. Mark the drive as failed and then tell the Pi to ignore it:

sudo mdadm --fail /dev/md126 /dev/sda
sudo mdadm --remove /dev/md126 /dev/sda
sync

After both commands finish, you can unplug the failing drive from the USB hub. Double check that you’re pulling out the correct disk. It’s highly likely you’ll suffer extreme data loss if you pull the wrong disk (the ‘sync’ at the end tries to reduce damage if you pull the wrong disk). If possible, power down the Pi to ensure you don’t unplug a drive that’s being used.
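
One way to do that double check from the command line is to confirm that the disk is no longer a member of any array, and then print its model and serial number so you can match them against the label on the physical drive. This is a sketch only. The function names are hypothetical, and it assumes your lsblk version supports the MODEL and SERIAL columns.

```shell
# Succeeds if DISK (e.g. "sda") is still listed as a member of any md array.
# Reads /proc/mdstat by default; a second argument overrides the file.
disk_in_any_array() {
    grep -Eq "(^|[[:space:]])$1\\[" "${2:-/proc/mdstat}"
}

# Report whether DISK looks safe to pull, and show how to identify it.
unplug_check() {
    if disk_in_any_array "$1"; then
        echo "$1 is still part of an array; do not unplug it yet"
        return 1
    fi
    echo "$1 is no longer in any array; match its serial number before pulling it:"
    lsblk -d -o NAME,SIZE,MODEL,SERIAL "/dev/$1"
}
```

In the scenario above you would run `unplug_check sda`. A disk that has been marked failed but not yet removed still shows up in /proc/mdstat, so the check refuses until both mdadm commands have completed.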

You’re still only using 1TB of the 4TB on the new disk. Time to use all the space on the drive.

First we expand the RAID array to use the entire disk, and then expand the LVM physical volume to fill the RAID array. Finally, we resize the logical volume and file system to use the newly created free space (same command that we used when we added a new disk above).

sudo mdadm --grow /dev/md126 --size=max
sudo pvresize /dev/md126 # automatically detects size
sudo lvresize --resizefs --extents '+100%FREE' datavg/datalv

All done.

Coming next: Setting up encryption so that the remote side only has encrypted data.
