RAID
RAID stands for Redundant Array of Inexpensive Disks, with “Inexpensive” sometimes replaced with “Independent”. It's about taking several possibly slow and unreliable disks and putting them together to get something that is faster and more reliable than any individual disk, and has more total capacity. Often the RAID array will have a hardware controller that handles the details of which disks to write to when. This can be done by software or by the OS itself, but using a controller is much faster.
There are several different RAID levels, which are designed for performance, reliability, or a combination of both. We will cover the important ones below. For each level, we will consider how a file is spread across the individual disks of the array. The file itself is broken into blocks, which are equal-sized pieces of the file, usually around 1 to 4 kilobytes each, depending on the system.
For each RAID level, we will be interested in three things: (1) redundancy (how it copes with the failure of a disk), (2) capacity (how much storage space we get), and (3) I/O speed (how fast reading from and writing to the array is).
Level 0 (Striping)
Suppose we have a file that consists of 16 blocks, numbered 1 to 16, and suppose we have 4 disks in our array. Below is a table showing how the blocks are spread across the disks. Each of the 4 columns is a different disk.
Disk #1 | Disk #2 | Disk #3 | Disk #4 |
1 | 2 | 3 | 4 |
5 | 6 | 7 | 8 |
9 | 10 | 11 | 12 |
13 | 14 | 15 | 16 |
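The round-robin placement in the table above can be sketched in a few lines of Python. This is only an illustration of the layout, not how any particular controller is implemented; the function name `stripe_layout` is made up for this example.

```python
def stripe_layout(num_blocks, num_disks):
    """Return one list per disk holding the block numbers stored on it."""
    disks = [[] for _ in range(num_disks)]
    for block in range(1, num_blocks + 1):
        # Blocks are dealt out in turn: block 1 to the first disk,
        # block 2 to the second, and so on, wrapping around.
        disks[(block - 1) % num_disks].append(block)
    return disks


layout = stripe_layout(16, 4)
# Print one row per stripe, one column per disk, like the table.
for row in zip(*layout):
    print(" | ".join(str(b) for b in row))
```

Each printed row corresponds to one row of the table, so the blocks in a row are the ones that can be written at the same time.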
- Redundancy — None. If any disk fails, then we lose part of the file.
- Capacity — Best possible. In the example above, if the disks are all the same size, then the capacity is 4 times the capacity of a single disk.
- Speed — Best possible. We can write to the disks in parallel. For instance, in the example above, while we are writing Block 5 to Disk #1, we can be writing Block 6 to Disk #2, Block 7 to Disk #3 and Block 8 to Disk #4. This is 4 times faster than writing all the blocks to a single disk.
Level 0 is all about performance instead of reliability. It's what you use if you want maximum space and speed and don't care if you lose data. Each disk you add increases the capacity and the speed, but it also increases the chances that one of the drives will fail. For instance, if you have 100 drives, each with a 1% chance of failure in the next month, there's a pretty good chance that at least one of them will fail (around 63%).
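The 63% figure can be checked with a short calculation. The sketch below assumes the drives fail independently of one another, which is a simplification; the function name is hypothetical.

```python
def prob_any_failure(num_disks, p_fail):
    """Chance that at least one of num_disks fails, assuming independence."""
    # All disks survive with probability (1 - p_fail) ** num_disks,
    # so at least one fails with the complement of that.
    return 1 - (1 - p_fail) ** num_disks


print(round(prob_any_failure(100, 0.01), 3))  # 0.634
```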
Level 1 (Mirroring)
Suppose we have 3 disks and a 4-block file. Here is the table showing how the blocks are spread across the disks.
Disk #1 | Disk #2 | Disk #3 |
1 | 1 | 1 |
2 | 2 | 2 |
3 | 3 | 3 |
4 | 4 | 4 |
- Redundancy — Best possible. Each disk is a copy of the others. We won't lose data unless every disk fails.
- Capacity — No gain. The capacity is the same as the capacity of a single disk.
- Speed — No gain. The writes all happen in parallel, so this is not any slower than a single disk, but it's also no faster.
Level 1 is what you use if all you care about is reliability and not performance.
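To get a feel for how much mirroring helps, here is a rough calculation of the chance that every disk in a mirror fails. It assumes independent failures and borrows the 1% per-month failure chance from the Level 0 discussion purely as an illustrative number; the function name is made up.

```python
def prob_all_fail(num_disks, p_fail):
    """Chance that every mirrored disk fails, assuming independence."""
    return p_fail ** num_disks


# Three mirrored disks, each with a 1% chance of failing:
# data is lost only if all three fail, about one chance in a million.
print(prob_all_fail(3, 0.01))
```

In practice failures are not fully independent (disks in one array often share age, batch, and power), so the real risk is higher than this estimate.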
Level 10 or 1+0
Level 10 (or Level 1+0) is a combination of Level 0 and Level 1, which is where it gets its name. The file is striped across some of the disks, and some of the disks are mirrors of others. Suppose we have 4 disks and a file of 6 blocks. Below is the table showing how the blocks are spread across the disks.
Disk #1 | Disk #2 | Disk #3 | Disk #4 |
1 | 1 | 2 | 2 |
3 | 3 | 4 | 4 |
5 | 5 | 6 | 6 |
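With 6 disks arranged as three mirrored pairs, a 9-block file is spread out like this:
Disk #1 | Disk #2 | Disk #3 | Disk #4 | Disk #5 | Disk #6 |
1 | 1 | 2 | 2 | 3 | 3 |
4 | 4 | 5 | 5 | 6 | 6 |
7 | 7 | 8 | 8 | 9 | 9 |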
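With 6 disks, we can instead keep three copies of each block. Here is an 8-block file laid out that way:
Disk #1 | Disk #2 | Disk #3 | Disk #4 | Disk #5 | Disk #6 |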
1 | 1 | 1 | 2 | 2 | 2 |
3 | 3 | 3 | 4 | 4 | 4 |
5 | 5 | 5 | 6 | 6 | 6 |
7 | 7 | 7 | 8 | 8 | 8 |
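- Redundancy — In the four-disk example, we can recover from losing any one disk, and possibly two, depending on which disks fail. Losing Disks #1 and #3 is fine, but losing Disks #1 and #2 destroys both copies of blocks 1, 3, and 5.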
- Capacity — In the four-disk example, we get double the capacity of a single disk.
- Speed — Since writes to different mirrored pairs can happen in parallel, in the four-disk example we get a two-times speedup.
Level 10 is a nice combination of performance and reliability. It is used for databases and web servers.
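The Level 10 layouts, and the "depending on which disks fail" caveat, can be sketched as follows. This assumes that adjacent disks form the mirror groups, as in the tables in this section; the function names and the `copies` parameter are made up for the example, and disks are numbered from 1 to match the tables.

```python
def raid10_layout(num_blocks, num_disks, copies=2):
    """Return the rows of the layout table; None marks an unused slot."""
    if num_disks % copies != 0:
        raise ValueError("num_disks must be a multiple of copies")
    groups = num_disks // copies  # number of mirror groups to stripe over
    rows = []
    for start in range(1, num_blocks + 1, groups):
        row = []
        for g in range(groups):
            block = start + g
            cell = block if block <= num_blocks else None
            row.extend([cell] * copies)  # same block on every disk in the group
        rows.append(row)
    return rows


def survives(failed_disks, num_disks, copies=2):
    """True if no mirror group has lost all of its disks."""
    failed = set(failed_disks)
    for first in range(1, num_disks + 1, copies):
        group = set(range(first, first + copies))
        if group <= failed:  # every copy in this group is gone
            return False
    return True


for row in raid10_layout(6, 4):
    print(" | ".join(str(b) for b in row))

print(survives({1, 3}, 4))  # True: each pair still has one good disk
print(survives({1, 2}, 4))  # False: both copies of blocks 1, 3, 5 are lost
```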
Level 4
Levels 2 and 3 are rarely used, so we will skip them. Level 4 is also rarely used, but its concepts are important for Levels 5 and 6, so we will cover it. Below is a table for a file with 16 blocks, where we use 5 disks.
Disk #1 | Disk #2 | Disk #3 | Disk #4 | Disk #5 |
1 | 2 | 3 | 4 | P1 |
5 | 6 | 7 | 8 | P2 |
9 | 10 | 11 | 12 | P3 |
13 | 14 | 15 | 16 | P4 |
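The fifth disk holds the parity blocks P1 through P4, one for each row. To see how parity works, suppose each block is a single bit. The parity bit on Disk #5 is the XOR of the four data bits in its row: it is 1 if the row contains an odd number of 1s and 0 otherwise.
Disk #1 | Disk #2 | Disk #3 | Disk #4 | Disk #5 |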
1 | 0 | 1 | 1 | 1 |
0 | 0 | 0 | 1 | 1 |
1 | 0 | 0 | 1 | 0 |
1 | 1 | 1 | 1 | 0 |
Here is how that parity is used to recover data: Consider the first row from the table above, and suppose that Disk #3 went down and we need to recover its contents, as shown below.
Disk #1 | Disk #2 | Disk #3 | Disk #4 | Disk #5 |
1 | 0 | ? | 1 | 1 |
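The four data bits and the parity bit always contain an even number of 1s in total, so the missing bit must be 1. Equivalently, it is the XOR of the surviving bits: 1 XOR 0 XOR 1 XOR 1 = 1. Real blocks hold many bits, and the same rule is applied to each bit position separately. For example, with 4-bit blocks:
Disk #1 | Disk #2 | Disk #3 | Disk #4 | Disk #5 |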
1011 | 0111 | 1010 | 1000 | 1110 |
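The 4-bit row above is consistent with XOR parity: the value on Disk #5 is the bitwise XOR of the four data blocks. The sketch below computes that parity and then rebuilds Disk #3 from the other disks. It is a minimal illustration using Python integers as blocks; the helper name `parity` is made up.

```python
from functools import reduce
from operator import xor


def parity(blocks):
    """Bitwise XOR of all the given blocks."""
    return reduce(xor, blocks, 0)


data = [0b1011, 0b0111, 0b1010, 0b1000]  # Disks #1 to #4
p = parity(data)                          # stored on Disk #5
print(format(p, "04b"))  # 1110

# Suppose Disk #3 is lost. XOR-ing the surviving data blocks with
# the parity block cancels them out and leaves the missing block.
surviving = [data[0], data[1], data[3], p]
recovered = parity(surviving)
print(format(recovered, "04b"))  # 1010
```

The same function serves for both computing parity and recovering data, because XOR-ing a value in twice removes it again.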
Level 5 and 6
As mentioned, Level 4 is rarely used. The reason is that the parity blocks are all stored on the same disk, which means you lose parallelism when writing. For instance, referring back to the first table in the Level 4 section, suppose you needed to write just blocks 1, 6, 11, and 16. Those blocks are all on different disks, so the four writes could happen in parallel and take about as long as a single write to one disk. However, each of those writes changes the parity for its row, so we would also have to write to Disk #5 four times, one after another. The overall write time is then no better than writing to a single disk four times in a row. Levels 5 and 6 fix this by spreading the parity around. Below is what Level 5 looks like with a 16-block file and 5 disks.
Disk #1 | Disk #2 | Disk #3 | Disk #4 | Disk #5 |
1 | 2 | 3 | 4 | P1 |
5 | 6 | 7 | P2 | 8 |
9 | 10 | P3 | 11 | 12 |
13 | P4 | 14 | 15 | 16 |
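The rotating parity in the Level 5 table can be reproduced with a short sketch. It assumes the rotation shown above, where the parity block starts on the last disk and moves one disk to the left in each row; real implementations may rotate differently. The function name is hypothetical.

```python
def raid5_layout(num_rows, num_disks):
    """Return the rows of a Level 5 layout table as lists of labels."""
    rows = []
    block = 1
    for r in range(num_rows):
        # Parity starts on the last disk and shifts left by one each row.
        parity_disk = (num_disks - 1 - r) % num_disks
        row = []
        for d in range(num_disks):
            if d == parity_disk:
                row.append(f"P{r + 1}")
            else:
                row.append(str(block))
                block += 1
        rows.append(row)
    return rows


for row in raid5_layout(4, 5):
    print(" | ".join(row))
```

Because the parity block lands on a different disk in each row, parity updates for different rows can go to different disks instead of queuing up on one.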
Level 6 is like Level 5 except that there are two parity blocks in each row. Here is an example with an 18-block file and 5 disks.
Disk #1 | Disk #2 | Disk #3 | Disk #4 | Disk #5 |
1 | 2 | 3 | P1 | P′1 |
4 | 5 | P2 | P′2 | 6 |
7 | P3 | P′3 | 8 | 9 |
P4 | P′4 | 10 | 11 | 12 |
P′5 | 13 | 14 | 15 | P5 |
16 | 17 | 18 | P6 | P′6 |
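The Level 6 table follows a similar rotation, with the pair of parity blocks shifting left by one disk per row and wrapping around the end. The sketch below reproduces only the placement seen in the table; it says nothing about how the second parity block is computed. The function name is made up, and an ASCII apostrophe stands in for the prime mark.

```python
def raid6_layout(num_rows, num_disks):
    """Return the rows of a Level 6 layout table as lists of labels."""
    rows = []
    block = 1
    for r in range(num_rows):
        # The two parity blocks sit side by side and shift left each row,
        # wrapping from the first disk back around to the last.
        p_disk = (num_disks - 2 - r) % num_disks
        q_disk = (num_disks - 1 - r) % num_disks
        row = []
        for d in range(num_disks):
            if d == p_disk:
                row.append(f"P{r + 1}")
            elif d == q_disk:
                row.append(f"P'{r + 1}")
            else:
                row.append(str(block))
                block += 1
        rows.append(row)
    return rows


for row in raid6_layout(6, 5):
    print(" | ".join(row))
```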
- Redundancy — If a disk goes down, we can use the others to reconstruct the missing data. Level 5 can handle at most one disk failure, while Level 6 can handle two.
- Capacity — Good. For Level 5 with 5 disks, 1/5 of the space is used for parity and 4/5 for data, so we get 4 times the capacity of a single disk. For Level 6, it's 2/5 for parity and 3/5 for data, so we get 3 times the capacity of a single disk.
- Speed — Good for reading since the reads can be done in parallel. Writes are somewhat slower because of the parity computations. Level 6 is slower than Level 5 because there is more parity to compute and write.
There are also Levels 50 and 60 that are combinations of Levels 5 and 6 with Level 0, in a similar way to how Level 10 is a combination of Level 1 with Level 0.
Levels 5 and 6 provide a nice tradeoff between reliability and performance.
Overall summary
The most popular levels seem to be Level 10 and Level 6. If you're choosing between the different levels, read up online to see which one best suits your use case. Level 6 can withstand losing any 2 disks, and so can Level 10 with 6 or more disks when each block is stored on three of them, which may be important if you are worried about losing data. Often if one disk fails, another one might not be far behind. The rebuilding process can take many hours, especially with Level 6, and it's possible another disk could fail in that time.
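As a rough way to compare the levels side by side, the capacity and redundancy figures from the sections above can be collected into one small function. This is a summary sketch under the same assumptions as the examples (equal-sized disks, and for Level 10 a fixed number of copies per block); the function name and the `copies` parameter are made up.

```python
def summarize(level, num_disks, copies=2):
    """Return (capacity in single-disk units, failures always survivable)."""
    if level == 0:
        return num_disks, 0
    if level == 1:
        return 1, num_disks - 1
    if level == 10:
        # One disk's worth of space per mirror group; we are only
        # guaranteed to survive losing all but one copy of a block.
        return num_disks // copies, copies - 1
    if level == 5:
        return num_disks - 1, 1
    if level == 6:
        return num_disks - 2, 2
    raise ValueError(f"unsupported RAID level: {level}")


print(summarize(10, 4))            # (2, 1)
print(summarize(10, 6, copies=3))  # (2, 2)
print(summarize(5, 5))             # (4, 1)
print(summarize(6, 5))             # (3, 2)
```

The second number is the guaranteed minimum. Level 10 may survive more failures than that if they happen to fall in different mirror groups.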