One case where this might go wrong: if you accidentally delete a file, the deletion replicates and you're left without a backup. Replication is nice, but it is not a replacement for backup.
I feel like a filesystem with native snapshot support (ZFS, btrfs), replicated once (replication is also native in ZFS), obsoletes conventional 3-2-1 backup systems. It’s technically only 2 copies, but you’re protected from all the same failure modes.
To clarify, all data transferred from Linux ZFS to FreeBSD-based systems, or vice versa, is copied using restic or rsync.
I only use ZFS replication for Linux-to-Linux transfers, and in those cases both machines run the exact same operating system and version of OpenZFS.
Eh, especially if you're just using Linux and FreeBSD (doubly so now that they're both using OpenZFS), it's easy enough to keep pool features compatible. Obviously you need to either pin to a compatible feature level or avoid upgrading the pool, but I don't think it's terribly hard.
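For example, OpenZFS 2.1 and later have a pool-level "compatibility" property for exactly this. A minimal sketch (the feature-set name is just one of the files shipped in /usr/share/zfs/compatibility.d; pick whichever matches your older system):

    # create a new pool pinned to a feature set both hosts understand
    zpool create -o compatibility=openzfs-2.0-linux tank mirror /dev/sda /dev/sdb

    # or restrict an existing pool, so future 'zpool upgrade' runs
    # only enable features from that set
    zpool set compatibility=openzfs-2.0-linux tank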
This was before FreeBSD used OpenZFS; maybe it's easier now. But my point still stands: you're off the beaten track at a time when you want to minimize risk.
Heh, you must have really foul luck. I've been using it on FreeBSD since 2007ish and have never had any issues. Same pool, though no longer the same disks, as they have all been replaced since. The hardware around it was replaced too, and FreeBSD 8 was upgraded through every release since, currently running 14.0.
Never lost any data. There isn't much software that could claim that.
Meanwhile I lost (I had backups) all my data twice on btrfs. I know it is more stable now, but I certainly won't ever use it again. Even HAMMER1 (I would love to use HAMMER2, but until my server dies I will stay with FreeBSD) lost it only once, and even in that case, after a debugging IRC session with Matt Dillon, I was able to recover most files.
The only thing that pisses me off is that Kubuntu doesn't support it through the installer (yes, I could do it manually, but I've been there with Fedora and I am sick of tracking whether the initramfs was updated or ending up in an unbootable situation, easily solvable but annoying), and I am now forced to use Ubuntu with KDE on my workstation. But that is not ZFS's fault.
> The only thing that pisses me off is that Kubuntu doesn't support it through the installer ...
For my personal workstation I've started experimenting with using Proxmox (it's a Debian 12 variant) as the OS, because its installer supports multi-drive ZFS installation (mirrors/RAID10, RAIDZ1/2, etc) out of the box. So the boot setup is currently mirrored SSDs, with /home on mirrored NVMe drives.
Apt installing the standard desktop stuff afterwards (Nvidia drivers, KDE desktop, etc) has worked well, and it all seems happy.
That being said, I'm only 4 or 5 days into this test setup. So far so good though. :)
It's quite possible to get ZFS into weird states without too much effort when you're screwing around with the underlying devices (adding, deleting, changing things).
This seems to crop up at really inconvenient times too, like when you're trying to do something during a scheduled outage. :(
That kind of thing aside though, it's been pretty solid in my use for actual data storage.
Just don't use ZFS's native encryption + ZFS snapshots + send/recv.
Reportedly that combination is a cause of data corruption.
Hmmm, from what I understand zrepl can copy from source to backup in either push or pull mode. I would say push mode is still a very fragile way to do backups.
Imagine the source machine is compromised and the attacker decides to delete/encrypt your data, then sees there is a backup mechanism connecting to the backup machine. What prevents them from deleting/encrypting the backups as well?
You'd definitely want the backup machine to pull the snapshots, and to have no way to connect to it from the source machine with a user that has access to the data, or with an admin account. That means no ssh keys on the source machine, and no password kept in a password manager that gets loaded on the source machine either.
Another strong method would involve 3 machines:
source --push--> replica1 <--pull-- replica2
Source and replica1 would have ZFS filesystems and snapshots, while replica2 uses a different filesystem (LVM + ext4) and its own snapshots, to guard against a replicated bug making the data unavailable. The ZFS snapshots could be saved as individual files on that filesystem.
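As a rough sketch of the pull side (hostnames, dataset names, and the backup user are all hypothetical), replica2 could fetch an incremental stream from replica1 over ssh and keep it as an ordinary file on its ext4 filesystem, so replica1 never initiates a connection to it:

    #!/bin/sh
    # Runs on replica2. Assumes the initial full stream was already saved.
    PREV=tank/data@2024-06-01
    CURR=tank/data@2024-06-02
    ssh backup@replica1 "zfs send -i $PREV $CURR" \
        > /backups/tank-data_2024-06-02.zfsstream

Restoring later means replaying the stream chain into a real pool with 'zfs receive'.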
> ... what prevents them from deleting/encrypting the backups as well?
With ZFS snapshots the older snapshots would still be present on the target server, in their unencrypted form.
> That means no ssh keys on the source machine ...
Typically for non-interactive logins (e.g. script access and similar) you take the extra step of configuring the receiving ssh to only allow a specific command for a given key.
It's a configurable ssh thing, where you add extra info to the .ssh/authorized_keys file on the destination server. With that approach, it doesn't allow general user logins while still allowing the source machine to send the data.
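A minimal sketch of what such an entry can look like (the key, user, and dataset names are invented); the forced command is the only thing this key is allowed to run, regardless of what the client asks for:

    # ~backup/.ssh/authorized_keys on the destination server
    command="zfs receive -u tank/backups/source",no-pty,no-port-forwarding,no-agent-forwarding,no-X11-forwarding ssh-ed25519 AAAA... source-replication-key

The source machine then pipes its stream through 'ssh backup@destination', and sshd substitutes the forced receive command on its end.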
> With ZFS snapshots the older snapshots would still be present on the target server, in their unencrypted form.
If and only if you apply the best practices mentioned in the second point of your post above.
Anyway, I'd rather protect my backups as much as I can and not allow the source machine any direct access to an account on the backup server, because of possible security issues. Security is hard sometimes, and you never know when some bug might widen the attack surface. I like my backup servers to not have any open ports. The caveat is that this is only possible if your primary backup server is locally and physically accessible. If you are travelling, you might want to be able to access it without being physically present. An option might be to keep ssh disabled when you are at home and enable the service when you know you will be away long enough that needing to restore data would otherwise be a problem.
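If the backup host happens to be a systemd-based Linux box (just an assumption about the setup), that toggle is a one-liner each way:

    # before leaving home: allow remote access for restores
    sudo systemctl enable --now ssh     # the unit is 'sshd' on some distros

    # back home: stop the daemon and close the port again
    sudo systemctl disable --now ssh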
That doesn't make any sense. Native snapshots are fantastic, but they are merely an effective way to do backups, one where you shift some complexity from the backup software to the filesystem.
(only talking about backups; snapshots of course have utility beyond that use case as well)
Well... the issue with encrypted ZFS + raw send is that a pool encrypted with one common key for all volumes becomes an individual key per volume on the receiving side, while a non-raw send means your target can read your files. If you use a keyfile this is a non-issue. If you type your key, well, you import all the old volumes, create a new pool, and send them over again, re-encrypting them under a common key. Very crude, but doable for home-scale setups.
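A rough sketch of that re-encryption step (pool and dataset names are invented): a non-raw send into a pool that is encrypted at the top lets the received dataset inherit the new pool's single key:

    # new pool, encrypted once at the root
    zpool create -O encryption=on -O keyformat=passphrase newpool mirror /dev/sdc /dev/sdd

    # non-raw send: data is decrypted on the way out of the old pool and
    # re-encrypted under newpool's key on receive
    zfs send oldpool/data@migrate | zfs receive newpool/data

Sending dataset by dataset (rather than a full -R stream with properties) avoids dragging the old per-volume encryption settings along.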
You don't actually need a dedicated ZFS backup program. A simple cron script will handle incremental backups just fine. If anyone is interested, the script we use to back up our multi-TB PostgreSQL database can be found here: https://lackofimagination.org/2022/04/our-experience-with-po...
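Not that script, but a minimal illustration of the pattern (dataset, host, and snapshot naming are all invented), assuming an initial full send has already been done:

    #!/bin/sh
    # Incremental ZFS backup, intended to be run from cron.
    set -eu
    DATASET=tank/pgdata
    REMOTE=backup@backuphost

    # most recent existing snapshot becomes the incremental base
    LAST=$(zfs list -H -t snapshot -o name -s creation -d 1 "$DATASET" | tail -n 1)
    NEW="$DATASET@$(date +%Y-%m-%d_%H%M)"

    zfs snapshot "$NEW"
    zfs send -i "$LAST" "$NEW" | ssh "$REMOTE" zfs receive -u tank/backups/pgdata

Pruning old snapshots on both ends is the part where a tool like zrepl, or a slightly longer script, earns its keep.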
Funny story -- when I was working on the xkcd Machine comic, I actually used the ZFS snapshots to rescue data. I accidentally blew away some early physics prototype code and fished it out of /.zfs/snapshot.
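For anyone who hasn't tried it: every ZFS filesystem exposes a hidden .zfs/snapshot directory under its mountpoint, so a rescue like that is just a copy (the snapshot and path names below are made up):

    ls /.zfs/snapshot/                                        # one directory per snapshot
    cp /.zfs/snapshot/auto-2024-03-01/home/user/prototype.py ~/restored/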
My facepalm moment of this year was accidentally restoring a ZFS snapshot of my root pool from a week ago, except it was actually from a year and a week ago. Didn't lose any of my data, but suddenly I had applications offering to format their version-mismatched databases.
If that's the case, then doing so with replicated ZFS snapshots is probably not a good idea.
That specific scenario (ZFS encryption -> replication of encrypted snapshots) is a known cause of ZFS corruption. :(
https://www.phoronix.com/news/OpenZFS-Encrypt-Corrupt
Unfortunately it doesn't seem to be widely known about, though there is an open suggestion to document it officially:
https://github.com/openzfs/openzfs-docs/issues/494