Should I bother with RAID?

Dust0741@lemmy.world · 2 years ago

Should I bother with RAID?

sugar_in_your_tea@sh.itjust.works · 2 years ago

Read perf would be the same or better if you didn’t add redundancy

RAID 1 can absolutely be faster than a single disk for read perf, and on Linux it is tuned to be faster. It’s not why you’d use it, but it is a feature of RAID. Intuitively, since both disks hold exactly the same data, each disk can serve a different read at the same time. Likewise, for writes, the two disks don’t have to write at the same instant, as long as the array always stays consistent (e.g. don’t flip the metadata segment until both have written the data), so you can even get a write boost.
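To make the read side concrete, here is a toy model. It is not how the Linux md driver actually picks a mirror: it assumes every read is independent, all reads are queued up front, and each read goes to whichever mirror frees up first. `finish_time` and its parameters are names invented for this sketch.

```python
def finish_time(read_costs, n_mirrors):
    """Time to drain a queue of reads when each read goes to the mirror that frees up first."""
    busy_until = [0.0] * n_mirrors           # when each mirror becomes idle again
    for cost in read_costs:
        idlest = busy_until.index(min(busy_until))
        busy_until[idlest] += cost           # that mirror serves this read
    return max(busy_until)                   # finished once the last mirror is idle


reads = [1.0] * 8                            # eight equal-cost reads, arbitrary time units
single = finish_time(reads, 1)               # one disk serves everything
mirrored = finish_time(reads, 2)             # two mirrors share the queue
```

Under these assumptions the mirrored pair drains the same queue in half the time; real gains depend on seek patterns and queue depth.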

If performance is all you care about, then yeah, go ahead and use RAID 0. But you do get a performance boost with mirroring as well.

Yes, a backup should be tested, but it shouldn’t be relied on. Internet can go down, services can have maintenance, etc., so it’s a lot better to never need it. If you can afford a mirror, it’s worth having.

Atemu@lemmy.ml · 2 years ago

RAID 1 can absolutely be faster than a single disk for read perf, and on Linux it is tuned to be faster.

You’re missing the point entirely. I never said to use a single disk; I explicitly compared it to RAID0.

As far as data security is concerned, JBOD/linear combination and RAID0 are the same, so you’d obviously use RAID0 if you didn’t need redundancy.

sugar_in_your_tea@sh.itjust.works · 2 years ago

No, JBOD is not the same as RAID0. With RAID0, you always need the disks in sync because reads need to alternate. With JBOD, as long as your reads are distributed, only one disk at a time needs to be active for a given read and you can benefit from simultaneous reads on different disks. RAID0 will probably give the biggest speedup in a single-user scenario, whereas I’d expect JBOD to potentially outperform it in a multi-user scenario, assuming your OS and filesystem are tuned for it.
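A rough sketch of the layout difference, assuming simple striping with a fixed chunk size for RAID0 and a plain linear concatenation for JBOD. The function names and sizes are made up for illustration and don’t correspond to any real tool.

```python
def raid0_locate(block, n_disks, chunk_blocks):
    """Map a logical block to (disk, block-on-disk) under simple striping."""
    chunk = block // chunk_blocks            # which chunk the block falls in
    disk = chunk % n_disks                   # chunks rotate across the disks
    stripe = chunk // n_disks                # how many full stripes come before it
    return disk, stripe * chunk_blocks + block % chunk_blocks


def linear_locate(block, disk_sizes):
    """Map a logical block to (disk, block-on-disk) when disks are concatenated."""
    for disk, size in enumerate(disk_sizes):
        if block < size:
            return disk, block
        block -= size                        # skip past this disk entirely
    raise ValueError("block is beyond the end of the volume")


def disks_touched(locate, start, length):
    """Which disks a sequential read of `length` blocks starting at `start` hits."""
    return sorted({locate(b)[0] for b in range(start, start + length)})


# Hypothetical setup: two disks of 100 blocks each, 4-block chunks.
striped = disks_touched(lambda b: raid0_locate(b, 2, 4), 0, 16)
linear = disks_touched(lambda b: linear_locate(b, [100, 100]), 0, 16)
```

In this setup the 16-block read touches both disks when striped but only the first disk when concatenated, which is what leaves the second disk free for someone else’s read.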

RAID0 is pretty much never the solution, and I’d much rather have JBOD than RAID0 in almost every scenario.

RAID1 gives you redundancy while preserving the ability for disks to independently seek, so on competent systems (e.g. Linux and BSD), you’ll get a performance speedup over a single disk and get something that rivals RAID0 in practice. You wouldn’t use it for performance because JBOD is probably just as fast in practice without the storage overhead penalty (again, assuming you properly distribute reads across disks), but you do get some performance benefits, which is nice.

Atemu@lemmy.ml · 2 years ago

JBOD is not the same as RAID0

As far as data security is concerned, JBOD/linear combination and RAID0 are the same

With RAID0, you always need the disks in sync because reads need to alternate. With JBOD, as long as your reads are distributed, only one disk at a time needs to be active for a given read and you can benefit from simultaneous reads on different disks

RAID0 will always have the performance characteristics of the slowest disk times the stripe width.

JBOD will have performance depending on the disk currently used. With sufficient load, it could theoretically max out all disks at once, but that’s extremely unlikely and, with that kind of load, you’d necessarily have a queue so deep that latency shoots to the moon, resulting in an unusable system.
Most important of all, however, is that you cannot control which device is used. This means you cannot rely on getting better perf than the slowest device because, with any IO operation, you might just hit the slowest device instead of the more performant drives, and there’s no way to predict which you’ll get.
It goes further, too, because any given application is unlikely to have a workload that even distributes over all disks. In a classical JBOD, you’d need a working set of data that is greater than the size of the individual disks (which is highly unlikely) or lots of fragmentation (you really don’t want that). This means the perf that you can actually rely on getting in a JBOD is the perf of the slowest disk, regardless of how many disks there are.

Perf of slowest disk * number of disks > Perf of slowest disk.

QED.
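Spelled out as a back-of-the-envelope calculation, assuming ideal striping with no overhead. The MB/s figures and function names are invented for illustration, not measurements.

```python
def raid0_sequential(speeds):
    """Ideal striped throughput: every disk runs at the pace of the slowest one."""
    return min(speeds) * len(speeds)


def jbod_guaranteed(speeds):
    """Throughput you can count on when any request may land on the slowest disk."""
    return min(speeds)


disk_speeds = [150, 180, 200]                # hypothetical sequential MB/s per disk
raid0_floor = raid0_sequential(disk_speeds)  # slowest disk * number of disks
jbod_floor = jbod_guaranteed(disk_speeds)    # slowest disk only
```

With these made-up numbers the floor is 450 MB/s striped versus 150 MB/s concatenated.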

You also assume that disk speeds are somehow vastly different whereas in reality, most modern hard drives perform very similarly.
Also, nobody in their right mind would design a system that groups together disks with vastly different performance characteristics when performance is of any importance.