Block Storage With RAID
This article covers the best practices for implementing software RAID on your SpinUp Cloud Servers by using SpinUp Block Storage volumes, focusing on the most common RAID types used in production environments.
The content covered in this article assumes you have an introductory understanding of disaster recovery, fault tolerance, and redundancy concepts for deploying applications to cloud environments.
Why use RAID?
At some point, you are likely to encounter some kind of failure scenario. With Block Storage, for example, issues might come up with the physical host servers that house your storage volume instances. In these cases, any volumes you have on that physical host are entirely inaccessible while the host is down. When the host is brought back up, your storage volume might also be in read-only mode, which prevents you from writing to the volume.
You typically have to wait until the issues with the host are resolved before you can address any data concerns. That is valuable time, and for mission-critical applications, this downtime can be quite damaging. RAID gives you another option for reducing downtime and the loss of data that you keep on storage volumes.
Implementing software RAID with multiple SpinUp Block Storage volumes provides an additional layer of fault tolerance in the event of a host-wide incident. Certain RAID implementations also produce an overall improvement in performance.
Please note that RAID is not a substitute for a fully viable backup solution. You need to ensure that you have a good backup and restore system in place to have a reference point to roll back to in cases of disk corruption.
Overview of RAID types
This section covers the basic features of each RAID type, its minimum disk requirements, and its performance and redundancy tradeoffs. It includes diagrams of each configuration to help illustrate the differences between RAID types.
RAID 0
Minimum number of disks required: 2
RAID 0 uses striping to read and write files across multiple drives.
Striping distributes or splits the data evenly across multiple hard drives that are configured in the RAID 0 array. This distribution enables improved performance because multiple drives are being harnessed to read and write data.
The major drawback of RAID 0 is that it provides no redundancy. The data is striped across all of the drives in the array, so if a disaster causes any one of the drives to suffer irreparable damage, the data is lost. RAID 0 should not be used by itself in any critical production system.
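The striping described above can be illustrated with a minimal Python sketch. It is a toy model rather than how any particular RAID implementation stores data: the function names `stripe` and `unstripe` and the two-byte chunk size are assumptions chosen for readability.

```python
def stripe(data, num_disks, chunk_size):
    """Split data into chunks and deal them round-robin across the disks."""
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least 2 disks")
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        disks[chunk_index % num_disks].append(data[offset:offset + chunk_size])
    return disks


def unstripe(disks):
    """Reassemble the data by reading one chunk from each disk in turn."""
    chunks = []
    for row in range(max(len(disk) for disk in disks)):
        for disk in disks:
            if row < len(disk):
                chunks.append(disk[row])
    return b"".join(chunks)


# Hypothetical two-disk array with 2-byte chunks.
disks = stripe(b"ABCDEFGHIJ", num_disks=2, chunk_size=2)
restored = unstripe(disks)
```

Each disk ends up with only part of the data, which is why losing any single disk in this model makes the original unrecoverable.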
RAID 1
Minimum number of disks required: 2
RAID 1 introduces disk mirroring, which clones data from one disk to another.
This means you have at least two drives with an identical data set. This is important because RAID 1 introduces redundancy. If you lose one disk to catastrophic failure, you still have the entire data set available on the other disk.
However, RAID 1 doesn’t have the advantage of striping, so you don’t get the performance benefit that RAID 0 provides. Every write must go to each disk in the mirror, so writes are no faster than on a single disk, although some implementations can serve reads from either copy.
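As a rough illustration of mirroring, the following Python sketch models each disk as a dictionary of block numbers to data. The `Mirror` class and its methods are hypothetical names invented for this example, not part of any RAID tool.

```python
class Mirror:
    """Toy RAID 1 model: every write goes to every healthy disk."""

    def __init__(self, num_disks=2):
        if num_disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        self.disks = [{} for _ in range(num_disks)]  # block number -> bytes
        self.failed = set()

    def write(self, block, data):
        # The same data is cloned onto each disk that is still healthy.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                disk[block] = data

    def fail(self, index):
        # Simulate a catastrophic failure: the disk's contents are gone.
        self.failed.add(index)
        self.disks[index].clear()

    def read(self, block):
        # Any surviving disk holds the full data set.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                return disk[block]
        raise IOError("all mirrors have failed")


mirror = Mirror(num_disks=2)
mirror.write(0, b"customer records")
mirror.fail(0)              # lose the first disk
survivor = mirror.read(0)   # still readable from the second disk
```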
RAID 5
Minimum number of disks required: 3
RAID 5 uses a technique called parity.
Parity is extra data, computed from the data blocks in each stripe, that lets the array rebuild whatever was stored on a failed drive. In RAID 5, parity is not kept on one dedicated drive; it is distributed across all of the drives in the array. So in this case, you have three disks, and in each stripe two of the disks hold data while the third holds that stripe's parity, with the parity position rotating from stripe to stripe. The files you store in this RAID configuration are striped across the three disks, and the parity is used to reconstruct data that is unreadable or otherwise lost.
RAID 5 provides both more redundancy than RAID 0 and better performance than RAID 1 because of striping. With this minimum three-disk configuration, you can lose any one disk and still retain all of your data. There are some performance drawbacks due to the addition of parity, however. Write speeds, in particular, might suffer due to parity calculations.
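RAID 5 parity is commonly computed as the bytewise XOR of the data blocks in a stripe, which is what allows a single missing block to be recomputed from the others. The Python sketch below shows the idea for one stripe on a three-disk array. The helper names are assumptions, and `parity_disk` uses a simple illustrative rotation; real implementations choose among several rotation layouts.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def parity_disk(stripe_index, num_disks):
    """Hypothetical rotation: the disk that holds parity for a stripe."""
    return stripe_index % num_disks


# One stripe on a three-disk array: two data blocks plus one parity block.
data_blocks = [b"spin", b"up!!"]
parity = xor_blocks(data_blocks)

# Simulate losing the disk that held the first data block, then rebuild
# it from the surviving data block and the parity block.
rebuilt = xor_blocks([data_blocks[1], parity])
```

The same XOR that produces the parity also recovers the missing block, which is why the array survives the loss of any one disk but not two.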
RAID 10
Minimum number of disks required: 4
RAID 10 combines striping and mirroring (RAID 1 and RAID 0 combined).
This provides excellent redundancy and performance for your system. RAID 10 requires a minimum of four disks that are split into spans, which are typically groups of two. The data is mirrored within each span and striped across the spans.
With RAID 10, you get the best of both worlds in terms of performance and redundancy. The main downside is cost: with two-disk spans, every block is stored twice, so only half of the raw capacity is usable, which makes this option, by far, the most expensive.
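To make the span layout concrete, this Python sketch maps a chunk of data to the disks that would hold it, assuming chunks are striped round-robin across spans and copied to every disk inside the chosen span. The function name `raid10_placement` and the disk numbering are assumptions made for illustration.

```python
def raid10_placement(chunk_index, num_disks, span_size=2):
    """Return the disk numbers that store a given chunk.

    Chunks are striped across spans; within a span, every disk holds
    an identical copy (the mirror).
    """
    if num_disks < 4:
        raise ValueError("RAID 10 needs at least 4 disks")
    if num_disks % span_size != 0:
        raise ValueError("disks must divide evenly into spans")
    num_spans = num_disks // span_size
    span = chunk_index % num_spans            # striping across spans
    first_disk = span * span_size
    return list(range(first_disk, first_disk + span_size))  # mirroring


# Four disks form two spans: disks 0-1 and disks 2-3.
layout = [raid10_placement(chunk, num_disks=4) for chunk in range(4)]
```

In this model, consecutive chunks alternate between the two spans, and each chunk survives the loss of either disk in its span.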
Summary
RAID has two primary purposes:
Improve performance of disk operations by distributing the data across multiple hard drives.
Prevent data loss and downtime by using multiple hard drives to withstand drive failures.
When making a case for which RAID type to use, you should carefully consider your applications’ needs. Not every application or platform is going to require an overly robust RAID configuration.
As mentioned previously, if you want solid fault tolerance together with performance benefits, RAID 10 stands out as the strongest option. It is the most expensive option, but if you are implementing any kind of mission-critical application, RAID 10 is the best bet to keep your data fault tolerant.
Remember that RAID is not a substitute for having backups in place, so you still need to use appropriate methods to keep your data backed up, and you should test those backups regularly.
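When weighing cost against fault tolerance, it can help to compare usable capacity for each RAID type. The following Python sketch applies the standard capacity rules to each type at its minimum disk count. The function name `raid_summary` and the 100 GB volume size are assumptions for illustration, and the RAID 10 case assumes two-disk spans.

```python
def raid_summary(level, num_disks, disk_gb):
    """Usable capacity and guaranteed tolerated failures for a RAID type."""
    minimums = {0: 2, 1: 2, 5: 3, 10: 4}
    if level not in minimums:
        raise ValueError("unsupported RAID level")
    if num_disks < minimums[level]:
        raise ValueError("not enough disks for this RAID level")
    if level == 10 and num_disks % 2 != 0:
        raise ValueError("RAID 10 with two-disk spans needs an even count")

    if level == 0:
        data_disks, tolerated = num_disks, 0        # striping only
    elif level == 1:
        data_disks, tolerated = 1, num_disks - 1    # every disk is a copy
    elif level == 5:
        data_disks, tolerated = num_disks - 1, 1    # one disk's worth of parity
    else:
        data_disks, tolerated = num_disks // 2, 1   # half the disks are mirrors

    return {
        "raw_gb": num_disks * disk_gb,
        "usable_gb": data_disks * disk_gb,
        "guaranteed_failures": tolerated,
    }


# Each RAID type at its minimum disk count, using 100 GB volumes.
for level, disks in [(0, 2), (1, 2), (5, 3), (10, 4)]:
    print(f"RAID {level}:", raid_summary(level, disks, 100))
```

For RAID 10, the figure is the number of failures that is always survivable; the array can survive more if the failed disks happen to fall in different spans.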
For specific instructions on how to configure RAID, check the documentation for your server’s operating system.