Help with TrueNAS Replication mistake

As the title suggests I have made a (probably critical) mistake.

I am pretty new to TrueNAS. I set up a TrueNAS server a while back and did the minimum needed to get it working and start dumping data onto it, as I desperately needed the space.

I recently built a new server with new drives and did a better job, though not much better, as I am learning as I go. After some reading and video watching, I decided to do a “replication” of the final load of data on Server A into a dataset on Server B that already had data in it, thinking it would replicate the data from Server A and merge it with the existing data in the dataset on Server B, much as a copy/paste would in Windows.

Now what has happened is that I have NO data in the Dataset on Server B.

Ignoring for a moment why there is no data from Server A in this dataset on Server B; that is irrelevant, as the data is still on Server A.

What is important is whether I can get back the data that was just minutes ago in the Dataset on Server B.

Many thanks.

Addendum: I am not so new at this as to keep writing data to Server B, and I have turned off the “auto” setting on the replication task so as not to make the situation worse, and have otherwise touched nothing.

Not sure what happened here, but is the dataset on Server B still there? How about the ZFS pool? If yes, then did you have a snapshot from before copying stuff over? If not, then that dataset’s toast.

I suppose you did some kind of zfs send of the first dataset to the dataset on Server B, without knowing that this replaces the data (you could have just sent the data to a non-existent dataset on Server B and it would have been created there). ZFS is a file system, not a file manager. Also, replication means copying data from source A to target B and deleting anything on target B that mismatches or is absent from source A.
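For context, TrueNAS replication is roughly a zfs send piped into zfs recv under the hood. A minimal sketch of what likely ran, with made-up pool and dataset names:

```shell
# Hypothetical names: tankA/data on server A, tankB/data on server B.
# Snapshot the source, then send the whole stream to the target:
zfs snapshot tankA/data@repl-1
zfs send -R tankA/data@repl-1 | ssh serverB zfs recv -F tankB/data

# The -F on the receiving side forces the target to match the stream:
# it rolls back and destroys anything in tankB/data that is not in
# the incoming snapshot. It mirrors; it does not merge.
```

These commands need real pools on both ends, so treat them as illustration only.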

I will keep telling people to set up a small backup server, like an ODROID HC4, with something like restic or even bare snapshots with rsync on top. If you make a mistake, you have backups in a place that is not supposed to be affected by whatever is going on elsewhere. Other accidents can still happen, but with something like restic, it’s unlikely.

Yes, yes, and yes, although on that last point the dataset on Server B “had” a snapshot, as it refused to let me set up a replication job without one. BUT when it came to running the replication job, it asked me whether to (I don’t remember the exact wording) “replace” the snapshot with the one from Server A!

If there is no usable snapshot for this dataset on Server B, is there a data recovery option?

That occurred to me after this happened, and you have confirmed it, thanks.

I made that classic mistake of making an assumption: that it would replicate the contents and ADD them to the existing dataset. But it seems this is essentially a backup tool that backs up an entire dataset and erases whatever is at the destination, much like a tape drive.

I will end up with two mass-storage servers and one or two “mini-servers”; both servers in question here are mass-storage servers, and I have seemingly just accidentally nuked a few terabytes of data while trying to sort out the poorly set up first TrueNAS server. I have a lot of the nuked data on backup drives, spread across 9-10 drives, and on a second ZFS RAID array on Server B.

This was the first major step towards setting everything up correctly, setting up better backups, and reorganising my data, which is a mess. A large part of that mess is because I set up TrueNAS poorly on Server A, and also on the other ZFS array on Server B, which is not part of this discussion/problem.

Thanks very much for your reply.

If zfs list -t snapshot doesn’t show any snapshot you recall taking, then the dataset on Server B is toast. Can’t do anything about it (particularly if you allowed the transfer to finish). There might be a recovery tool for ZFS files somewhere, but I kinda doubt it.

That’s syncing.

Not sure what was wrong with server A, but I’d be surprised if anything was actually wrong on server A. TrueNAS is a system that kinda protects you from messing up (for better or worse). I assume it was something with the pool configuration or something, idk.

Hope this is not unrecoverable.

I turned the server back on and will do this now.

I tried rsync first, but the performance was a joke, so I looked for alternatives and found this.

The pool configuration on Server A is 6x HDDs in RAIDZ2 with a couple of write-cache disks. The cache disks seem to have been redundant so far, as my network speed up until now could never overwhelm the RAM; they were spare drives anyway, so I just added them after setting up the rest of the array. Perhaps this info will help you or others figure out what went wrong.

Likewise, but if it is gone, I will first look at my various backup drives to see if I have that data; if I don’t, then I had already considered it unimportant, and the name of the dataset should give away the contents: “To be sorted”. Some good, some junk, some duplicates of data held elsewhere besides backups, and none of it important enough to have already been sorted. Remembering exactly what was there is the tricky part.

Again, thanks for your help.

I assume the dataset is “toast”. I don’t know how to read this, but I assume the snapshots listed by the command you showed me should have a quantity of B (bytes?) under “USED”, should not have a - under “AVAIL”, and should have a number under “REFER”.

I will look for a ZFS recovery tool; as much as anything else, I just want the names of the top-level directories so I can find them on my backup drives, as the contents will all be the same.

Following on from that: back in the days of backups spread over multiple floppy disks, tapes, etc., if data was spread over multiple physical media, the software would tell you which one it was on when you looked for a certain file/folder.

Is there some simple piece of software, not part of a backup suite, that keeps track of what data is where across multiple external drives?

In the past (before things got so disorganised), I used to manually keep a txt file of which folders were where for data on PCs/servers and USB drives. Obviously this was a bit of a PITA, and an automatic option would be great to have. Any ideas for software, or even what to search for, as many of the terms are different from what I am expecting? Thanks.

# zfs list zroot/ROOTFS/home
NAME                USED  AVAIL  REFER  MOUNTPOINT
zroot/ROOTFS/home   111G  72.9G  92.3G  /home

# zfs list -t snapshot
NAME                         USED  AVAIL  REFER  MOUNTPOINT
zroot/ROOTFS/home@20231029  18.6G      -   103G  -

Available space doesn’t matter, because a snapshot is not a mountpoint; the listing only shows you how much space is being utilized (“data saved/retained”) by the snapshot. In the zfs list of the mountpoint, you see it uses 111 GB on the pool and has 72.9 GB available to write on the pool.

The used space includes all the snapshots the dataset has. The REFER column is the actual amount of data in the mountpoint (i.e. if I did an rsync of /home to somewhere else, I’d be copying ~92.3G). The snapshot uses 18.6G and there’s only one, so if you subtract 18.6 from 111, you get about 92.4G, almost exactly what ZFS is reporting (the small difference is just rounding from bytes).

The fact that you see the snapshot is good. ZFS works by having the snapshots always mounted in the root of the volume you take a snapshot of.

For example, I can browse all the data from the home@20231029 snapshot by doing ls -l /home/.zfs/snapshot/. You will not see the “.zfs” folder in an ls -la /home, because it is hidden from everything unless you specifically type it manually. So if you ls -lah /home/.zfs, it will work, even though there is “no folder” visible in the root of your dataset.

And I can simply ls -alh /home/.zfs/snapshot/20231029/ and see the contents of /home as they were in the snapshot.
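So if a snapshot does survive, recovery is just a copy out of that hidden directory. A sketch using the example dataset above (the Documents folder and destination path are made up):

```shell
# Copy a folder back out of the read-only snapshot:
cp -a /home/.zfs/snapshot/20231029/Documents /home/Documents.restored

# Or roll the whole dataset back to the snapshot state
# (this destroys everything written after the snapshot was taken):
zfs rollback zroot/ROOTFS/home@20231029
```

The rollback variant is all-or-nothing, so copying individual folders out is usually the safer first step.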

Highly doubt it. What you’re looking for is an archival tool, which is not technically a backup, but is considered as such by backup tools (particularly if you archive the data to a slower storage tier and a tertiary location, like off-site tape).

I was looking for alternatives for moving my less frequently accessed data to large HDDs, like doing backups of the same data to 2x Blu-ray discs and indexing the content somewhere centralised, to easily find the discs I need, with an index of the discs on the discs themselves in case I lose access to the centralised index (so I don’t have to scan the whole content of the Blu-rays if something goes wrong).

I haven’t found such a software solution. I was thinking of rolling my own archival scripts, but I did a cost analysis of Blu-rays and IIRC it came out on par with HDDs for more work, and if I don’t fill the discs to the max with data, the cost per GB will be way higher for Blu-ray than for HDDs. So I’m stuck with moving my data to my NAS, and I have a backup server that I just turn on whenever I want to run backups.
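For the indexing side, a bare-bones version of that idea needs little more than find. This sketch (the index location, mount points, and drive labels are all made up) writes one index file per drive into a central folder, which you can then grep from one place:

```shell
#!/bin/sh
# Keep one index file per external drive in a central folder.
INDEX_DIR="${INDEX_DIR:-$HOME/drive-index}"
mkdir -p "$INDEX_DIR"

# index_drive MOUNT_POINT LABEL
# Records every directory up to two levels deep, one per line,
# prefixed with the drive label so grep hits show which drive it is.
index_drive() {
    mount_point="$1"
    label="$2"
    find "$mount_point" -maxdepth 2 -type d \
        | sed "s|^$mount_point|[$label]|" \
        > "$INDEX_DIR/$label.txt"
}

# Usage after plugging a drive in:
#   index_drive /mnt/usb1 usb1
# Later, to find which drive holds a folder:
#   grep -ri "to be sorted" "$INDEX_DIR"
```

Re-running index_drive after each reorganisation keeps the index current, and the per-drive files are small enough to also copy onto the drives themselves.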

Apologies for not getting back to you, I promise that I will actually do so, but I have other things to do (like find the nuked data on backup drives), life, happenings, etc.