Automated full-disk backup on Linux/Ubuntu
Now that I’m storing my valuable smart home data on a Raspberry Pi, I need a backup in case something goes wrong, most notably a power failure.
I settled on a full-disk incremental backup scheme using rsnapshot, combined with an explicit InfluxDB backup, which seems to work nicely.
Goal ¶
On my Raspberry Pi I have the following assets to protect:
- Overall Raspbian config (which took a while to converge to)
- InfluxDB with data
- Grafana with dashboards
- Worker scripts
Goals:
- Protect against data corruption through power failure
- Bonus: protect against user error (keep version history)
- Bonus: protect against fire (offsite backup)
Alternatives considered ¶
Besides a backup, I briefly considered a UPS to protect against power outages. This would be nicer, as I would never need to check the filesystem for corruption after an actual power failure. However, a UPS is more complex (it needs extra wiring/soldering), more expensive, and of unclear reliability (a single successful test is no guarantee that it will work next time, whereas backups can be verified to some extent). Most importantly, a UPS could actually increase the risk of fire, which would nullify the UPS solution and cause worse problems.
Ideas I had and decided not to use:
- Use a power bank as UPS (tutorials-raspberrypi.com): pro: easy, reliable; con: needs a power bank that can charge and discharge simultaneously (expensive), increased fire risk (affordable power banks are not designed for continuous use), additional logic required to power down during longer outages
- Custom Rai solution (thepihut.com): pro: integrated solution, reliable?; con: more complex, fire risk, additional logic required to power down during longer outages
Backup solution ¶
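I settled on using rsnapshot. I tried dd before, as recommended here (raspberrypi.org) and there (stackexchange.com), but this did not work for me: I want to back up a live filesystem, and dd does not cope well with this, most likely because it copies raw blocks while the mounted filesystem keeps changing, so the resulting image can be inconsistent.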
For reference, this command did NOT work:
ionice -c 3 dd bs=4M if=/dev/mmcblk0p2 | gzip \
> /media/backup/$(date +%Y%m%d).raspbian.img.gz
Backup influxdb ¶
Since backing up a live database is prone to error (I don’t know what would happen when copying the files of a database that is being written to), I back up InfluxDB separately using the following script, which runs daily at midnight:
#!/usr/bin/env bash
/usr/bin/influxd backup -portable \
/home/pi/backup/influx_snapshot.db/$(date +%Y%m%d) && \
/home/pi/.local/bin/rotate-backups \
--daily 5 --weekly 4 --yearly "always" \
/home/pi/backup/influx_snapshot.db/
This makes a backup using InfluxDB’s own backup utility (influxdata.com) and rotates the backups using rotate-backups (pypi.org). Rotating has the advantage that I can roll back a few days in case I accidentally delete my data.
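For a restore, the newest dated directory has to be picked first. A minimal sketch, assuming the backup directories are named YYYYMMDD as in the script above; the helper name latest_snapshot is hypothetical and not part of the original setup:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the newest YYYYMMDD-named directory under a backup root.
# Prints nothing if no such directory exists.
latest_snapshot() {
    local root="$1"
    # Eight-digit date names sort chronologically when sorted lexically.
    find "$root" -mindepth 1 -maxdepth 1 -type d \
        -name '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]' \
        | sort | tail -n 1
}
```

The result could then be passed to InfluxDB's restore command, for example: influxd restore -portable "$(latest_snapshot /home/pi/backup/influx_snapshot.db)". Whether extra options are needed when the target database already exists depends on the InfluxDB version, so it is worth checking its documentation first.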
Get USB stick ¶
Get a reliable USB stick; it can be slow and thus cheap (USB 2.0 suffices for my ~3 GB RPi 3B+ installation). Some options:
- Transcend JetFlash 790 64 GB Black, 15 EUR, 35 MB/s
- Transcend JetFlash 790 128 GB Black, 21 EUR, 35 MB/s
- Intenso Speed Line 128 GB Black, 18 EUR, 23 MB/s
- Sandisk Ultra 64 GB, 13 EUR, 22 MB/s
- Intenso Speed Line 256 GB Black, 35 EUR, 89 MB/s
- Kingston DataTraveler 100 G3 64 GB Black, 8 EUR, 15 MB/s (collapses for random write)
Based on this hardware.info review (hardware.info).
Add to /etc/fstab to automount:
UUID=6AC72A3C-8CAA-445F-83CE-35FF5D76BD01 /media/backup ext4 noatime,noexec,nosuid 0 0
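If the stick is missing or failed to mount, anything written to /media/backup would land on the SD card instead. A cron job can check explicitly that the stick is mounted before writing to it. A minimal sketch using mountpoint from util-linux; the function name require_mounted is hypothetical:

```bash
#!/usr/bin/env bash
# Hypothetical guard: succeed only if the given directory is an active mount point.
require_mounted() {
    local target="$1"
    if ! mountpoint -q "$target"; then
        echo "backup target $target is not mounted, skipping backup" >&2
        return 1
    fi
}
```

A backup command could then be chained behind it, for example: require_mounted /media/backup && /usr/bin/rsnapshot daily.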
Configure & schedule rsnapshot ¶
As a backup tool I use rsnapshot (rsnapshot.org), which has been around for a while and is built on the robust rsync backend. I used the DigitalOcean guide (digitalocean.com) and the linuxconfig guide (linuxconfig.org).
I use the following config (note that rsnapshot requires the fields on each line to be separated by tabs, not spaces):
config_version 1.2
# Set backup target to usb stick
snapshot_root /media/backup/rsnapshot/
# don't create root directory because it's already there
no_create_root 1
cmd_cp /bin/cp
cmd_rm /bin/rm
cmd_rsync /usr/bin/rsync
cmd_logger /usr/bin/logger
cmd_du /usr/bin/du
# Keep 7 daily, 4 weekly and 4 monthly backups
retain daily 7
retain weekly 4
retain monthly 4
verbose 2
loglevel 3
lockfile /var/run/rsnapshot.pid
# Exclude
exclude /media/
exclude /dev/
exclude /mnt/
exclude /lost+found/
exclude /proc/
exclude /tmp/
# Add backup target
backup / rpi3b/
After setting up, test the config and do a dry run:
rsnapshot configtest
rsnapshot -t daily
And finally add to cron (as root):
# daily backup runs at 01:20 am to include stuff happening at midnight
20 01 * * * /usr/bin/rsnapshot daily
# weekly backup runs at 01:05 am on Sunday, just before that day's daily backup
05 01 * * 7 /usr/bin/rsnapshot weekly
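The config retains four monthly snapshots, but rsnapshot only rotates a level when it is invoked for that level, so the monthly level needs its own cron entry as well. A possible line, where the day and time are an assumption chosen to run before the weekly and daily jobs:

```cron
# monthly rotation at 00:50 am on the 1st of each month (assumed schedule)
50 00 1 * * /usr/bin/rsnapshot monthly
```

Running the larger interval first follows the same ordering as the weekly entry, which is scheduled before the daily one.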
Conclusion & next steps ¶
Using this backup scheme I largely cover two of the three goals:
- Protect against data corruption - works with a maximum delay of 1 day, which could be shortened by increasing the backup frequency. Open risk: recovering a crashed/corrupted system is fairly slow, as I likely have to reconfigure things manually
- Bonus: protect against user error (keep version history) - covered nicely via incremental backups
- Bonus: protect against fire (offsite backup) - not covered, but if my home is gone, the loss of home automation is OK
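Restoring individual files does not need special tooling, since rsnapshot stores each snapshot as a plain directory tree. A minimal sketch of how the path to a backed-up file is composed, assuming the snapshot_root and the rpi3b/ destination from the config above; the helper name snapshot_path and the SNAPSHOT_ROOT variable are hypothetical:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print where a file lives inside a given rsnapshot snapshot.
# Usage: snapshot_path <snapshot, e.g. daily.0> <absolute path on the Pi>
snapshot_path() {
    local snapshot="$1" file="$2"
    # Default root and the rpi3b/ destination mirror the rsnapshot config.
    local root="${SNAPSHOT_ROOT:-/media/backup/rsnapshot}"
    printf '%s/%s/rpi3b%s\n' "${root%/}" "$snapshot" "$file"
}
```

For example, the version of /etc/fstab from two days ago could be copied back with: sudo cp -a "$(snapshot_path daily.2 /etc/fstab)" /etc/fstab.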
Some ideas to extend & improve:
- Automatic method to check for data corruption upon hard system crash
- Automatic method to check for invisible data corruption upon system crash after which system seems to work OK
- Backup to offsite host (rsync works over ssh)
- Run using ionice/nice to reduce backup load
- Automatically umount usb stick when not in use
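The ionice/nice idea from the list above can be sketched as a small wrapper. This is a minimal sketch, not a tested part of the setup: the function name low_priority is hypothetical, and it falls back to plain nice when ionice is missing or not permitted:

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: run a command with idle I/O priority and lowest CPU priority.
low_priority() {
    # Probe ionice first; fall back to nice alone if it is unavailable here.
    if ionice -c 3 true 2>/dev/null; then
        ionice -c 3 nice -n 19 "$@"
    else
        nice -n 19 "$@"
    fi
}
```

In a script called from cron this could wrap the backup call, for example: low_priority /usr/bin/rsnapshot daily. The exit status of the wrapped command is passed through, so cron error reporting keeps working.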