While reading many of the blogs and posts here about self-hosting, I notice that self-hosters spend a lot of time searching for and migrating between VPS or backup hosts. Being a cheapskate, I have a Raspberry Pi with a large disk attached that I leave at a relative's house, and I rsync my backup drive to it nightly. The problem is that when something happens, I have to walk them through a reboot, troubleshoot over the phone, or worse, wait until a holiday when we all meet.
What would a solution look like for a bunch of random tech nerds who happen to live near each other to cross-host each other's offsite backups? How would you secure it, support it, and make it resilient to bad actors? Do you think it could work? What are the drawbacks?
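For what it's worth, the nightly sync in my setup is nothing fancy; roughly this crontab line on the home server (the hostname, paths, and log file here are examples, not my real ones):

```shell
# crontab entry: push the backup drive to the offsite Pi at 02:30 every night
30 2 * * * rsync -az --delete /mnt/backup/ backup@pi-offsite:/mnt/bigdisk/backup/ >> /var/log/offsite-rsync.log 2>&1
```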
I have local incremental backups and rsync to the remote. Doesn't Syncthing have incremental versioning too? You have a good point about syncing a destroyed disk over your offsite backup. I know S3 has some sort of protection against that, but I haven't played with it.
It's an established, stable, understood and very very thoroughly debugged and tested protocol/server solution that'll run on a potato and has clients for every OS you've ever heard of, and a bunch you haven't.
Setting up your own little mini-network and sharing groups is fairly trivial and it'll happily shove copies of everyone's data to every server that's on the feed.
Just encrypt your shit, post it, and let the software do the rest.
(I mean, if it's good enough to move 200TB of perfectly legitimate Linux ISOs a day, it'll handle however much data you could possibly be backing up.)
Disclaimer: it's not quite that simple, but it's pretty close to it. Also, I'm very much a UNIX boomer and a big fan of the simplest solution with the longest tested history over shiny new shit, so just making that bias clear.
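If it helps, the "encrypt your shit" step can be as boring as a tar pipe into gpg before anything leaves the box. A sketch under my own assumptions (the `encrypt_dir` helper name, paths, and passphrase file are examples):

```shell
# Encrypt a directory client-side before handing it to any transport.
# encrypt_dir is a hypothetical helper; symmetric AES256 via gpg.
encrypt_dir() {
  dir=$1      # directory to back up
  out=$2      # encrypted output file
  passfile=$3 # file containing the passphrase
  tar -czf - -C "$dir" . |
    gpg --batch --yes --pinentry-mode loopback \
        --passphrase-file "$passfile" \
        --symmetric --cipher-algo AES256 -o "$out"
}
```

Whoever stores the result only ever sees ciphertext; the passphrase file never leaves your machine.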
I've done a backup swap with friends a couple of times. Security wasn't much of a worry, since we connected to each other's boxes over SSH, WireGuard, or similar and used tools that support encryption. The biggest challenge was that everyone in my self-hosting friend group prefers different protocols, so we had to figure out what each of us wanted to use to connect and access filesystems, and set that up. The second challenge was keeping the remote access we set up for each other online, and that's what killed the project: we all eventually stopped maintaining it and nobody seemed to care. If I were to do it again, I would make sure every participant has alerts monitoring their shared endpoint.
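Concretely, the alerting can be a dead-man's-switch check: each side pings the peer's VPN address on a schedule and only then checks in with a monitoring service, which emails everyone when check-ins stop. A sketch in the style of healthchecks.io (the peer IP and check URL are placeholders):

```shell
# crontab entry: every 5 minutes, confirm the peer's VPN IP answers, then check in;
# if check-ins stop, the monitoring service alerts both parties.
*/5 * * * * ping -c1 -W3 10.8.0.2 >/dev/null 2>&1 && curl -fsS --retry 3 https://hc-ping.com/YOUR-CHECK-UUID >/dev/null
```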
A lot of technical aspects here, but IMHO the biggest drawback is liability.
Would you offer free internet-connected storage to a group of "random tech nerds"? Do you trust all of them to use it properly? Are you really sure none of them will store and distribute illegal stuff with it? Do you know them in person, so you can point the police at them if they come knocking at your door?
I attended some LUGs before covid and could see something like this being facilitated there. It also reminds me of the Reddit meetups that I never partook in.
I would propose building a distributed hash table for this. But I would never host someone else's data this way, because I'm too afraid they'd hand me encrypted illegal content and some obscure law would pin the liability on me. That's just me, though.
TrueNAS with Tailscale/Headscale/NetBird on the software and security side. As for hardware, you want storage that is not attached via USB: either an off-the-shelf NAS or a DIY NAS build would work. A few YouTubers have touched on this; Hardware Haven and Raid Owl, I think.
Reliability of the connection to the drives, especially during unscheduled power cycles. USB is known for random drops, or for not picking the drive up before your other services have started, which can mean extra troubleshooting. It can run fine… or it might not. This is in reference to storage drives, not OS drives.
I have exactly the setup you described: a Raspberry Pi with an 8 TB SSD parked at a friend's place. It connects to my network via WireGuard automatically and just sits there until one of my hosts running Duplicati starts syncing the encrypted backups to it.
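For anyone wanting to replicate this: the "connects automatically" part is just a wg-quick config on the Pi with a keepalive so the tunnel stays up from behind the friend's NAT. A sketch (keys, addresses, and endpoint are placeholders):

```ini
# /etc/wireguard/wg0.conf on the remote Pi
# (enable at boot with: systemctl enable wg-quick@wg0)
[Interface]
PrivateKey = <pi-private-key>
Address = 10.8.0.2/24

[Peer]
PublicKey = <home-server-public-key>
Endpoint = home.example.net:51820
AllowedIPs = 10.8.0.0/24
# keep the NAT mapping alive so the home side can always reach the Pi
PersistentKeepalive = 25
```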
It sucked when CrashPlan's home client went under. If you installed the client on two computers with internet access, it would let you set the remote computer as a backup target. Encryption was done at the source, and it had dedupe and versioning. It ate a little RAM, but it was really nice.
Yes. It's the "put a copy somewhere else" part that I'm trying to solve without a lot of cost and effort. So far, having a remote copy at a relative's is good on both counts, but the time spent supporting it has been less than ideal, since the Pi sometimes becomes unresponsive for unknown reasons and getting the family member to reboot it "is too hard".
You could use Kopia for this (though you would need to schedule cron jobs or something similar to drive it).
The way this works with Kopia: you configure your backups to a particular location, and then, in between runs, there's a sync command you can use to copy the backup repository to other locations.
Kopia can also check a repository for bad blobs via its verify function (so you can make sure the stored backups are actually at least X% viable).
Using ZeroTier or Tailscale (for this, probably Tailscale because of the multithreading) would let you all create a virtual network in which the devices talk to each other directly. That would allow you to use Kopia's sync functionality with devices in their homes.
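A rough sketch of that flow with Kopia's CLI (the repo paths, username, and tailnet hostname are made up; Kopia encrypts the repository client-side, so the peer only ever stores ciphertext):

```shell
# back up locally into an existing Kopia repository
kopia repository connect filesystem --path=/srv/backup/repo
kopia snapshot create /home/me/data

# mirror the whole encrypted repository to a friend's box over the tailnet
# (this is the sync command mentioned above; run it from cron between backups)
kopia repository sync-to sftp --host=friend-nas.tailnet.example \
    --username=backup --keyfile=~/.ssh/id_ed25519 --path=/srv/peers/me

# spot-check that a sample of the stored data actually reads back
kopia snapshot verify --verify-files-percent=5
```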