• 0 Posts
  • 9 Comments
Joined 11 days ago
Cake day: January 21st, 2025




  • I just have them on a usb stick with a copy on the array as well so they can also be checked for bitrot. Even doing it for every file it’s not that much data and it’s scripted so it’s done pretty continuously (I do it weekly).

    Actual file backups are what I store off site. 2 copies, one here and one off. My data generally isn’t changed all that much so I don’t bother continually backing up most directories. Like it doesn’t make sense to have 30 backups of my tv folder with my shows. They’re the same shows. I have some redundancy, I don’t just do one and done, but tape media is expensive so I don’t do like monthly backups either. Tape is wildly impractical for most home users though, and offsite with tape means you need a trusted place to put it that’s reasonably safe and has a moderately decent climate/humidity. Though an advantage of tape is that basically no one but the biggest of tech dorks is going to be able to read that data (versus something like leaving an external hard drive or blu-ray at a friend’s house. Even if you trust them a LOT they might plug it in. Although encryption exists)

    It’s home data so it’s about balancing what makes sense with what’s cost effective and your risk tolerance

    Some data is crucial of course. My personal documents are backed up far more regularly, like once an hour or so, and that’s where I utilize services like backblaze. My business, which is healthcare oriented, is entirely different and that data is segregated and utilizes backblaze as well as specialized software since it handles PHI and HIPAA concerns. That’s backed up pretty much every few minutes.


  • Bitrot sucks

    Zfs protects against this. It historically has been a pain to work with for home users but recently the implementation of raidz expansion has made things a lot easier, as you can now expand vdevs and increase the size of arrays without doubling the number of disks.

    This is a potentially great option for someone like you who is just starting out, but it would still require a minimum of 3 disks and the associated hardware. Sucks for people like me though who built arrays lonnnnng before zfs had this feature! It was literally upstreamed like less than a year ago, good timing on your part (or maybe bad, maybe it doesn’t work well? I haven’t read much about it tbf but from the small amount I have read it seems to work fine. They worked on it for years)

    Btrfs is also an option for similar reasons as it has built in protections against bitrot. If you read on this there can be a lot of debate about whether it’s actually useful or dangerous. FWIW the consensus seems to be for single drives it’s fine. My array has a separate raid1 array of 2tb nvme drives, these are utilized as much higher speed cache/working storage for the services that run. Eg if a torrent downloads it goes to the nvme first as this storage is much easier to work with than the slow rotational drives that are even slower because they are in a massive array, then later the file is moved to the large array for storage in the middle of the night. Reading from the array is generally not an intensive operation but writing to it can be and a torrent that saturates my gigabit connection sometimes can’t keep up (or other operations that aren’t internet dependent like muxing or transcoding a video file). Anyway, this array has btrfs and has had 0 issues. That said I personally wouldn’t recommend it for raid5/6 and given the nature of this array I don’t care at all about the data on it

    My array has xfs. This doesn’t protect against bitrot. What you can do if you are in this scenario is what I do: once a week I run a plugin that checksums all new files and verifies checksums of old files. If checksums don’t match it warns me. I can then restore the invalid file from backup and investigate for issues (SMART errors, bad sata cable, ECC problem with ram, etc). The upside of my xfs array is that I can expand it very easily and storage is maximized. I have 2 parity drives and at any point I can simply pop in another drive and extend the array to be bigger. This was not an option with zfs until about 9 months ago. This is a relatively “dangerous” setup but my array isn’t storing amazingly critical data, it’s fully backed up despite that, and despite all of that it’s been going for 6+ years and has survived at least 3 drive failures
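
    For anyone without a plugin for this, here is a rough sketch of the same idea in Python. It is not the plugin I use, and `check_tree`, `sha256_of` and the JSON manifest layout are made-up names for illustration. It hashes every file under a folder, records hashes for new files, and reports files whose hash no longer matches what was recorded.

```python
import hashlib
import json
import os
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large media files never sit fully in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_tree(root, manifest_path):
    """Record checksums for new files and verify files already in the manifest.

    Returns (added, mismatched, missing) as sorted lists of paths relative to root.
    The manifest format (a flat JSON dict of path -> hash) is an assumption.
    """
    root = Path(root)
    manifest_path = Path(manifest_path)
    known = {}
    if manifest_path.exists():
        known = json.loads(manifest_path.read_text())

    added, mismatched, seen = [], [], set()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = Path(dirpath) / name
            if full.resolve() == manifest_path.resolve():
                continue  # never hash the manifest itself
            rel = full.relative_to(root).as_posix()
            seen.add(rel)
            current = sha256_of(full)
            if rel not in known:
                known[rel] = current
                added.append(rel)
            elif known[rel] != current:
                # Keep the old hash so the warning repeats until the file is restored.
                mismatched.append(rel)

    missing = sorted(set(known) - seen)
    manifest_path.write_text(json.dumps(known, indent=2, sort_keys=True))
    return sorted(added), sorted(mismatched), missing
```

    Run it weekly from cron or similar and alert on anything in `mismatched`. Note this sketch can’t tell bitrot from a file you edited on purpose, so it only really suits data that doesn’t change, like a media library.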

    That said my approach is inferior to btrfs and zfs because in this scenario, given redundancy, they could detect the bad block and repair it themselves during a scrub rather than needing me to manually restore from backup. One day I will likely rebuild my array with zfs especially now that raidz expansion is complete. I was basically waiting for that

    As always double check everything I say. It is very possible someone will reply and tell me I’m stupid and wrong for several reasons. People can be very passionate about filesystems


  • Yeah I have a 15 drive array.

    You can raid 1 and that’s basically just keeping a constant copy of the drive. A lot of people don’t do this because they want to maximize storage space but if you only have a 2 drive array it’s probably your safest option

    It’s only when you get to 3 drives (2 data drives + parity) that you have some potential to maximize storage space. Note that here you’re still basically sacrificing the space of an entire drive to parity, but you get double the usable space of a 2 drive raid 1, and this is arguably more resilient overall because the data is spread out over multiple drives. But it costs more because obviously you need more drives
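
    If the parity part sounds like magic, here is a toy sketch of single parity using XOR, which is the basic idea behind this kind of setup. Real arrays work on disk blocks and stripes, not little byte strings, and `xor_blocks` and `usable_fraction` are names I made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must be the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def usable_fraction(total_drives, parity_drives=1):
    """Share of raw capacity left for data when all drives are the same size."""
    return (total_drives - parity_drives) / total_drives


# Two "data drives" and the parity computed from them.
drive_a = b"tv shows"
drive_b = b"my docs!"
parity = xor_blocks([drive_a, drive_b])

# Pretend drive_a died: XOR of everything that survived gives it back.
rebuilt_a = xor_blocks([drive_b, parity])
print(rebuilt_a == drive_a)

# A 2 drive mirror keeps half the raw space, 2 data + 1 parity keeps two thirds.
print(usable_fraction(2), usable_fraction(3))
```

    The same trick is why you can only lose one drive with one parity drive: with two drives gone there is nothing left to XOR the missing data back out of.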

    Keep in mind none of these are backup solutions though. It’s true that when a drive dies in a raid array you can rebuild the data from other drives but it is also true that this operation is extremely stressful and can lead to death of the array. Eg in raid 1 a single drive dies and when adding a new drive the second drive that held the copy of your data starts having sector corruption during rebuild of the new drive, or in raid 5 one of the 3+ drives dies and when you rebuild from parity another drive dies for similar reasons. These drives are normally only being accessed occasionally and the rebuild operation is basically reading every sector on the drive, which puts the drives under heavy read load for a very long period of time (like days), especially if you have very large modern drives (18, 20, 24tb)
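
    To put rough numbers on the rebuild time, here is a back-of-the-envelope sketch. The 200 MB/s and 50 MB/s throughput figures are assumptions I picked for illustration (roughly an idle drive versus one sharing the array with normal use), not measurements, and `rebuild_hours` is just a made-up helper.

```python
def rebuild_hours(drive_tb, avg_mb_per_s):
    """Lower bound on the hours needed to read or write one whole drive."""
    total_mb = drive_tb * 1_000_000  # drive vendors sell decimal terabytes
    return total_mb / avg_mb_per_s / 3600


for size_tb in (18, 20, 24):
    fast = rebuild_hours(size_tb, 200)  # assumed best case sustained speed
    slow = rebuild_hours(size_tb, 50)   # assumed speed while the array is in use
    print(f"{size_tb} TB: {fast:.1f} h best case, {slow / 24:.1f} days under load")
```

    Even the best case is more than a full day of every remaining drive being read end to end, which is exactly when a second marginal drive tends to show itself.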

    So either be okay with your data going “poof” or back up your data as well. When I got started I was okay with certain things going “poof”, like pirated media, and would back up essential documents to cloud providers. This was really the only feasible solution because my array is huge (about 200tb with about 100tb used). But now I have tape backup so I back everything up locally, although I still back up critical documents to backblaze. Depends on your needs. I am very strict about not wanting to be integrated with google, apple, dropbox, etc. and my media collection is not simply stuff I can retorrent, it’s a lot of custom media I’ve put together the “best” version of to my taste. But to set something up like this either takes a hefty investment or, if you’re like me, years of trawling ewaste/recycling centers and decommission auctions (and it’s still pricey then but at least my data is on my server and not google’s)




  • Do these companies put their fingers on the scale? Almost certainly

    But what he said is exactly what brought us here. They have not particularly given a shit about politics (aside from no taxes and let me do whatever I want all the time). However, the algorithms will consistently reward engagement. Engagement doesn’t care about “good” or “bad”, it just cares about eyes on it, clicks, comments. And who wins that? Controversial bullshit. Joe Rogan getting elon to smoke weed. Someone talking about trans people playing sports. Etc

    This is a natural extension of human behavior. Human behavior occurs because of a function. I do x because of a function, function being achieving reinforcement. Attention, access to something, escaping, or automatic.

    Attention maintained behaviors are tricky because people are shitty at removing attention and attention is a powerful reinforcer. You tell everyone involved “this person feeds off of your attention, ignore them”. Everyone agrees. The problematic person pulls their bullshit and then someone goes “stop it”. They call it negative reinforcement (this is not negative reinforcement. it’s probably positive reinforcement. It’s maybe positive punishment, arguably, because it’s questionable how aversive it is).

    You get people to finally shut up and they still make eye contact, or non verbal gestures, or whatever. Attention is attention is attention. The problematic person continues to be reinforced and the behavior stays. You finally get everyone to truly ignore it and then someone new enters the mix who doesn’t get what’s going on.

    This is the complexity behind all of this. This is the complexity behind “don’t feed the trolls”. You can teach every single person on Lemmy or reddit or whoever to simply block a malicious user but tomorrow a dozen or more new and naive people will register who will fuck it all up

    The complexity behind the algorithms is similar. The algorithms aren’t people but they work in a similar way. If bad behavior is given attention the content is weighted and given more importance. The more we, as a society, can’t resist commenting, clicking, and sharing trump, rogan, peterson, transphobic, misogynist, racist, homophobic, etc content the more the algorithms will weight this as “meaningful”

    This of course doesn’t mean these companies are without fault. This is where content moderation comes into play. This is where the many studies that found social media leads to higher irritability, more passive aggressive behavior and lower empathy could potentially have led us to regulate these monsters into doing something to protect their users against the negative effects of their products

    If we survive and move forward, in 100 years social media will likely be seen the way we look at tobacco now. An absolutely dangerous thing that it was absurd to have allowed to exist in a completely unregulated state with 0 transparency as to its inner workings