Understanding RAID Concepts
Organizing Data Storage for Availability and Performance
Choosing RAID Levels and Concatenation
Comparing RAID Level and Concatenation Performance
Storage Management provides storage management using RAID (Redundant Array of Independent Disks) technology. Understanding storage management requires an understanding of RAID concepts, as well as some familiarity with how your system's RAID controllers and operating system view disk space.
What Is RAID?
RAID (Redundant Array of Independent Disks) is a technology for managing how data is stored on the physical disks that reside in your system or are attached to it. A key aspect of RAID is the ability to span physical disks so that the combined storage capacity of multiple physical disks can be treated as a single, extended chunk of disk space. Another key aspect of RAID is the ability to maintain redundant data which can be used to restore data in the event of a disk failure. RAID uses different techniques, such as striping, mirroring, and parity, to store and reconstruct data. There are different RAID levels that use different methods for storing and reconstructing data. The RAID levels have different characteristics in terms of read/write performance, data protection, and storage capacity. Not all RAID levels maintain redundant data, which means for some RAID levels lost data cannot be restored. Which RAID level you choose depends on whether your priority is performance, protection, or storage capacity.
NOTE: The RAID Advisory Board (RAB) defines the specifications used to implement RAID. Although the RAB defines the RAID levels, commercial implementations by different vendors may vary from the actual RAID specifications. A particular vendor's implementation may affect read and write performance and the degree of data redundancy.
Hardware and Software RAID
RAID can be implemented with either hardware or software. A system using hardware RAID has a RAID controller that implements the RAID levels and processes data reads and writes to the physical disks. When using software RAID, the operating system must implement the RAID levels. For this reason, using software RAID by itself can slow system performance. You can, however, use software RAID on top of hardware RAID volumes to provide greater performance and variety in the configuration of RAID volumes. For example, you can mirror a pair of hardware RAID 5 volumes across two RAID controllers to provide RAID controller redundancy.
NOTE: This release of Storage Management only supports hardware RAID.
RAID Concepts
RAID uses particular techniques for writing data to disks. These techniques enable RAID to provide data redundancy or better performance. These techniques include:
- Mirroring — Duplicating data from one physical disk to another physical disk. Mirroring provides data redundancy by maintaining two copies of the same data on different physical disks. If one of the disks in the mirror fails, the system can continue to operate using the unaffected disk. Both sides of the mirror contain the same data at all times. Either side of the mirror can act as the operational side. A mirrored RAID disk group is comparable in performance to a RAID 5 disk group in read operations but faster in write operations.
- Striping — Disk striping writes data across all physical disks in a virtual disk. Each stripe consists of consecutive virtual disk data addresses that are mapped in fixed-size units to each physical disk in the virtual disk using a sequential pattern. For example, if the virtual disk includes five physical disks, the stripe writes data to physical disks one through five without repeating any of the physical disks. The amount of space consumed by a stripe is the same on each physical disk. The portion of a stripe that resides on a physical disk is a stripe element. Striping by itself does not provide data redundancy. Striping in combination with parity does provide data redundancy.
- Stripe size — The total disk space consumed by a stripe not including a parity disk. For example, consider a stripe that contains 64KB of disk space and has 16KB of data residing on each disk in the stripe. In this case, the stripe size is 64KB and the stripe element size is 16KB.
- Stripe element — A stripe element is the portion of a stripe that resides on a single physical disk.
- Stripe element size — The amount of disk space consumed by a stripe element. For example, consider a stripe that contains 64KB of disk space and has 16KB of data residing on each disk in the stripe. In this case, the stripe element size is 16KB and the stripe size is 64KB.
- Parity — Parity refers to redundant data that is maintained using an algorithm in combination with striping. When one of the striped disks fails, the data can be reconstructed from the parity information using the algorithm.
- Span — A span is a RAID technique used to combine storage space from groups of physical disks into a RAID 10 or 50 virtual disk.
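The parity idea above can be illustrated with a minimal Python sketch. It assumes byte-wise XOR as the parity algorithm, which is a common choice, but the text does not say which algorithm a particular controller uses. The function name `xor_parity` and the 4-byte stripe elements are hypothetical and chosen only for readability.

```python
from functools import reduce

def xor_parity(elements):
    """Return the byte-wise XOR of equally sized stripe elements."""
    # zip(*elements) yields one tuple per byte position, one byte from each element.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*elements))

# Three data stripe elements, one per disk (assumed sample values).
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_parity(data)

# Simulate losing the disk that held data[1], then rebuild its stripe
# element from the surviving data elements plus the parity element.
survivors = [data[0], data[2], parity]
rebuilt = xor_parity(survivors)
```

Because XOR is its own inverse, combining the survivors with the parity element yields exactly the missing element. The same property explains why one failed disk per stripe can be tolerated, but not two.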
RAID Levels
Each RAID level uses some combination of mirroring, striping, and parity to provide data redundancy or improved read and write performance. For specific information on each RAID level, see "Choosing RAID Levels and Concatenation."
Organizing Data Storage for Availability and Performance
RAID provides different methods or RAID levels for organizing the disk storage. Some RAID levels maintain redundant data so that you can restore data after a disk failure. Different RAID levels may also entail an increase or decrease in the system's I/O (read and write) performance.
Maintaining redundant data requires the use of additional physical disks. As more disks become involved, the likelihood of a disk failure increases. Because of the differences in I/O performance and redundancy, one RAID level may be more appropriate than another based on the applications in the operating environment and the nature of the data being stored.
When choosing concatenation or a RAID level, the following performance and cost considerations apply:
- Availability or fault-tolerance. Availability or fault-tolerance refers to a system's ability to maintain operations and provide access to data even when one of its components has failed. In RAID volumes, availability or fault-tolerance is achieved by maintaining redundant data. Redundant data includes mirrors (duplicate data) and parity information (reconstructing data using an algorithm).
- Performance. Read and write performance can be increased or decreased depending on the RAID level you choose. Some RAID levels may be more appropriate for particular applications.
- Cost efficiency. Maintaining the redundant data or parity information associated with RAID volumes requires additional disk space. In situations where the data is temporary, easily reproduced, or non-essential, the increased cost of data redundancy may not be justified.
- Mean Time Between Failures (MTBF). Using additional disks to maintain data redundancy also increases the chance of disk failure at any given moment. Although this cannot be avoided in situations where redundant data is a requirement, it does have implications for the workload of your organization's system support staff.
For more information, see "Choosing RAID Levels and Concatenation."
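To make the point about additional disks concrete, the following sketch estimates the chance that at least one disk in a group fails during some period. It assumes that disks fail independently and with the same probability, which is a simplification. The 3% figure is a made-up illustration, not a measured failure rate, and the function name is hypothetical.

```python
def prob_any_disk_fails(per_disk_failure_prob, disk_count):
    """Probability that at least one of disk_count disks fails.

    Assumes independent failures with an identical per-disk probability.
    """
    if not 0.0 <= per_disk_failure_prob <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if disk_count < 1:
        raise ValueError("disk_count must be at least 1")
    # All disks survive with probability (1 - p) ** n; take the complement.
    return 1.0 - (1.0 - per_disk_failure_prob) ** disk_count

# Hypothetical 3% chance of failure per disk over the period of interest.
for disks in (1, 2, 6):
    print(disks, "disks:", round(prob_any_disk_fails(0.03, disks), 4))
```

Under these assumptions the probability grows with every disk added, which is why a redundant configuration generates more disk replacements even though it protects the data.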
Choosing RAID Levels and Concatenation
You can use RAID or concatenation to control data storage on multiple disks. Each RAID level or concatenation has different performance and data protection characteristics.
The following sections provide specific information on how each RAID level or concatenation stores data, as well as its performance and protection characteristics.
- "Concatenation"
- "RAID Level 0 (Striping)"
- "RAID Level 1 (Mirroring)"
- "RAID Level 5 (Striping with distributed parity)"
- "RAID Level 50 (Striping over RAID 5 sets)"
- "RAID Level 10 (Striping over mirror sets)"
- "RAID Level 1-Concatenated (Concatenated mirror)"
- "Comparing RAID Level and Concatenation Performance"
Concatenation
In Storage Management, concatenation refers to storing data on either one physical disk or on disk space that spans multiple physical disks. When spanning more than one disk, concatenation enables the operating system to view multiple physical disks as a single disk.
Data stored on a single disk can be considered a simple volume. This disk could also be defined as a virtual disk that comprises only a single physical disk. Data that spans more than one physical disk can be considered a spanned volume. Multiple concatenated disks can also be defined as a virtual disk that comprises more than one physical disk.
A dynamic volume that spans to separate areas of the same disk is also considered concatenated.
When a physical disk in a concatenated or spanned volume fails, the entire volume becomes unavailable. Because the data is not redundant, it cannot be restored by rebuilding from a mirrored disk or parity information. Restoring from a backup is the only option.
Because concatenated volumes do not use disk space to maintain redundant data, they are more cost-efficient than volumes that use mirrors or parity information. A concatenated volume may be a good choice for data that is temporary, easily reproduced, or that does not justify the cost of data redundancy. In addition, a concatenated volume can easily be expanded by adding an additional physical disk.
Figure 3-1. Concatenating Disks
- Concatenates n disks as one large virtual disk with a capacity of n disks.
- Data fills up the first disk before it is written to the second disk.
- No redundancy data is kept. When a disk fails, the large virtual disk fails.
- No performance gain.
- No redundancy.
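The fill-the-first-disk-first behavior described above can be sketched as a simple address mapping. This is an illustrative model only; the function name `locate_concatenated` and the unit-less sizes are assumptions, and a real controller or volume manager tracks this mapping internally.

```python
def locate_concatenated(offset, disk_sizes):
    """Map a virtual-disk offset to (disk index, offset on that disk).

    disk_sizes lists the capacity of each concatenated disk, in the same
    unit as offset. Data fills each disk completely before the next is used.
    """
    if offset < 0:
        raise ValueError("offset must not be negative")
    for index, size in enumerate(disk_sizes):
        if offset < size:
            return index, offset
        offset -= size  # skip past this disk and keep looking
    raise ValueError("offset is beyond the end of the concatenated volume")

# Two disks of 40 and 60 units: offset 45 lands 5 units into the second disk.
print(locate_concatenated(45, [40, 60]))
```

Because consecutive offsets stay on the same disk until it is full, a single request is usually served by one disk, which is consistent with the lack of a performance gain noted above.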
Related Information:
See the following:
- "Organizing Data Storage for Availability and Performance"
- "Controller-supported RAID Levels"
- "Number of Physical Disks per Virtual Disk"
- "Maximum Number of Virtual Disks per Controller"
RAID Level 0 (Striping)
RAID 0 uses data striping, which is writing data in equal-sized segments across the physical disks. RAID 0 does not provide data redundancy.
RAID 0 Characteristics:
- Groups n disks as one large virtual disk with a capacity of (smallest disk size) * n.
- Data is stored to the disks alternately.
- No redundancy data is kept. When a disk fails, the large virtual disk fails with no means of rebuilding the data.
- Better read and write performance.
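As a rough illustration of how striping distributes data, the sketch below maps a virtual-disk offset to a physical disk and an offset on that disk. It assumes a plain round-robin placement of stripe elements; the function name and the 16 KB element size (borrowed from the stripe size example earlier in this document) are illustrative choices, not controller defaults.

```python
def locate_striped(offset, disk_count, element_size):
    """Map a virtual-disk offset to (disk index, offset on that disk).

    Assumes stripe elements are placed round-robin across disk_count disks.
    """
    if offset < 0 or disk_count < 1 or element_size < 1:
        raise ValueError("invalid offset, disk_count, or element_size")
    # Which stripe element the offset falls in, and how far into it.
    element_number, offset_in_element = divmod(offset, element_size)
    # Elements are dealt out to the disks in turn, one stripe at a time.
    stripe_number, disk_index = divmod(element_number, disk_count)
    return disk_index, stripe_number * element_size + offset_in_element

# Four disks with 16 KB stripe elements give a 64 KB stripe (offsets in KB).
for kb in (0, 16, 32, 48, 64):
    print(kb, "->", locate_striped(kb, disk_count=4, element_size=16))
```

Consecutive stripe elements land on different disks, so a large sequential transfer can keep several disks busy at once. This is the source of the read and write performance benefit listed above.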
Related Information:
See the following:
- "Organizing Data Storage for Availability and Performance"
- "Comparing RAID Level and Concatenation Performance"
- "Controller-supported RAID Levels"
- "Number of Physical Disks per Virtual Disk"
- "Maximum Number of Virtual Disks per Controller"
RAID Level 1 (Mirroring)
RAID 1 is the simplest form of maintaining redundant data. In RAID 1, data is mirrored or duplicated on one or more physical disks. If a physical disk on one side of the mirror fails, then the data can be rebuilt using the physical disk on the other side of the mirror.
RAID 1 Characteristics:
- Groups n + n disks as one virtual disk with the capacity of n disks. The controllers currently supported by Storage Management allow the selection of two disks when creating a RAID 1. Because these disks are mirrored, the total storage capacity is equal to one disk.
- Data is replicated on the two disks.
- When a disk fails, the virtual disk still works. The data will be read from the failed disk's mirror.
- Better read performance, but slightly slower write performance.
- Redundancy for protection of data.
- RAID 1 is more expensive in terms of disk space because it uses twice as many disks as are required to store the data without redundancy.
Related Information:
See the following:
- "Organizing Data Storage for Availability and Performance"
- "Comparing RAID Level and Concatenation Performance"
- "Controller-supported RAID Levels"
- "Number of Physical Disks per Virtual Disk"
- "Maximum Number of Virtual Disks per Controller"
RAID Level 5 (Striping with distributed parity)
RAID 5 provides data redundancy by using data striping in combination with parity information. Rather than dedicating a physical disk to parity, however, the parity information is striped across all physical disks in the disk group.
Figure 3-4. Striping Disks with Distributed Parity
RAID 5 Characteristics:
- Groups n disks as one large virtual disk with a capacity of (n-1) disks.
- Redundant information (parity) is alternately stored on all disks.
- When a disk fails, the virtual disk still works, but it is operating in a degraded state. The data is reconstructed from the surviving disks.
- Better read performance, but slower write performance.
- Redundancy for protection of data.
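The idea of parity being stored alternately on all disks can be sketched as a per-stripe layout. The rotation rule used here (parity moves one disk to the left with each stripe) is only one possible scheme and is an assumption; actual controllers may rotate parity differently. The function name and the "D"/"P" labels are hypothetical.

```python
def raid5_stripe_layout(stripe_number, disk_count):
    """Return what each disk holds for one stripe: 'P' or a data label.

    Assumes parity starts on the last disk and moves one disk to the
    left with each successive stripe.
    """
    if disk_count < 3:
        raise ValueError("this sketch assumes at least three disks")
    parity_disk = (disk_count - 1 - stripe_number) % disk_count
    layout = []
    data_index = 0
    for disk in range(disk_count):
        if disk == parity_disk:
            layout.append("P")
        else:
            layout.append(f"D{data_index}")
            data_index += 1
    return layout

# Show four consecutive stripes on a four-disk group.
for stripe in range(4):
    print(stripe, raid5_stripe_layout(stripe, 4))
```

Each stripe holds n-1 data elements and one parity element, which matches the (n-1) capacity figure above. Over n consecutive stripes every disk takes one turn holding parity, so no single disk is dedicated to it.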
Related Information:
See the following:
- "Organizing Data Storage for Availability and Performance"
- "Comparing RAID Level and Concatenation Performance"
- "Controller-supported RAID Levels"
- "Number of Physical Disks per Virtual Disk"
- "Maximum Number of Virtual Disks per Controller"
RAID Level 50 (Striping over RAID 5 sets)
RAID 50 is striping over more than one span of physical disks. For example, a RAID 5 disk group that is implemented with three physical disks and then continues on with a disk group of three more physical disks would be a RAID 50.
It is possible to implement RAID 50 even when the hardware does not directly support it. In this case, you can implement more than one RAID 5 virtual disk and then convert the RAID 5 disks to dynamic disks. You can then create a dynamic volume that is spanned across all RAID 5 virtual disks.
RAID 50 Characteristics:
- Groups n*s disks as one large virtual disk with a capacity of s*(n-1) disks, where s is the number of spans and n is the number of disks within each span.
- Redundant information (parity) is alternately stored on all disks of each RAID 5 span.
- Better read performance, but slower write performance.
- Requires proportionally as much parity information as standard RAID 5.
- Data is striped across all spans. RAID 50 is more expensive in terms of disk space.
NOTE: On the PERC 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, 4e/Si, 4e/Di, and CERC ATA100/4ch controllers, there are special considerations when implementing RAID 50 on a disk group that has disks of different sizes. See "Considerations for RAID 10 and 50 on PERC 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, 4e/Si, 4e/Di, and CERC ATA100/4ch Controllers" for more information.
Related Information:
See the following:
- "Organizing Data Storage for Availability and Performance"
- "Comparing RAID Level and Concatenation Performance"
- "Controller-supported RAID Levels"
- "Number of Physical Disks per Virtual Disk"
- "Maximum Number of Virtual Disks per Controller"
RAID Level 10 (Striping over mirror sets)
The RAID Advisory Board considers RAID Level 10 to be an implementation of RAID level 1. RAID 10 combines mirrored physical disks (RAID 1) with data striping (RAID 0). With RAID 10, data is striped across multiple physical disks. The striped disk group is then mirrored onto another set of physical disks. RAID 10 can be considered a mirror of stripes.
Figure 3-6. Striping Over Mirrored Disk Groups
RAID 10 Characteristics:
- Groups n disks as one large virtual disk with a capacity of (n/2) disks.
- Mirror images of the data are striped across sets of physical disks. This level provides redundancy through mirroring.
- When a disk fails, the virtual disk is still functional. The data will be read from the surviving mirrored disk.
- Improved read performance and write performance.
- Redundancy for protection of data.
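The capacity figures given in the characteristics lists for each level can be collected into one small helper. This sketch assumes all disks are the same size and reports capacity in units of one disk; the function name and the level strings are hypothetical labels, not identifiers used by Storage Management.

```python
def usable_capacity_in_disks(level, total_disks, spans=1):
    """Usable capacity, in units of one disk, for equally sized disks."""
    if total_disks < 1 or spans < 1:
        raise ValueError("total_disks and spans must be positive")
    if level in ("concatenation", "raid0"):
        return total_disks                    # no space spent on redundancy
    if level in ("raid1", "raid10"):
        if total_disks % 2:
            raise ValueError("mirroring needs an even number of disks")
        return total_disks // 2               # every disk has a mirror copy
    if level == "raid5":
        return total_disks - 1                # one disk's worth of parity
    if level == "raid50":
        if total_disks % spans:
            raise ValueError("disks must divide evenly among the spans")
        disks_per_span = total_disks // spans
        return spans * (disks_per_span - 1)   # one disk of parity per span
    raise ValueError(f"unknown level: {level}")

# Six disks arranged three different ways.
for level, spans in (("raid0", 1), ("raid5", 1), ("raid50", 2)):
    print(level, usable_capacity_in_disks(level, 6, spans))
```

Comparing the results for the same number of disks shows the cost side of the trade-off: the more redundancy a level maintains, the less of the raw capacity remains available for data.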
NOTE: On the PERC 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, 4e/Si, 4e/Di, and CERC ATA100/4ch controllers, there are special considerations when implementing RAID 10 on a disk group that has disks of different sizes. See "Considerations for RAID 10 and 50 on PERC 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, 4e/Si, 4e/Di, and CERC ATA100/4ch Controllers" for more information.
Related Information:
See the following:
- "Organizing Data Storage for Availability and Performance"
- "Comparing RAID Level and Concatenation Performance"
- "Controller-supported RAID Levels"
- "Number of Physical Disks per Virtual Disk"
- "Maximum Number of Virtual Disks per Controller"
RAID Level 1-Concatenated (Concatenated mirror)
RAID 1-concatenated is a RAID 1 disk group that spans across more than a single pair of physical disks. This combines the advantages of concatenation with the redundancy of RAID 1. No striping is involved in this RAID type.
Also, RAID 1 Concatenated can be implemented on hardware that supports only RAID 1 by creating multiple RAID 1 virtual disks, upgrading the virtual disks to dynamic disks, and then using spanning to concatenate all of the RAID 1 virtual disks into one large dynamic volume.
Figure 3-7. RAID 1-Concatenated
NOTE: This RAID level is used only with PERC 3/Si and PERC 3/Di controllers.
Related Information:
See the following:
- "Organizing Data Storage for Availability and Performance"
- "Comparing RAID Level and Concatenation Performance"
- "Controller-supported RAID Levels"
- "Number of Physical Disks per Virtual Disk"
- "Maximum Number of Virtual Disks per Controller"
Considerations for RAID 10 and 50 on PERC 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, 4e/Si, 4e/Di, and CERC ATA100/4ch Controllers
On the PERC 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, 4e/Si, 4e/Di, and CERC ATA100/4ch controllers, there are special considerations when implementing RAID 10 or RAID 50 on a disk group that has disks of different sizes. When implementing RAID 10 or RAID 50, disk space is spanned to create the stripes and mirrors. The span size can vary to accommodate the different disk sizes. There is, however, the possibility that a portion of the largest disk in the disk group will be unusable, resulting in wasted disk space. For example, consider a disk group that has the following disks:
Disk A = 40 GB
Disk B = 40 GB
Disk C = 60 GB
Disk D = 80 GB
In this example, data will be spanned across all four disks until Disk A and Disk B and 40 GB on each of Disk C and D are completely full. Data will then be spanned across Disks C and D until Disk C is full. This leaves 20 GB of disk space remaining on Disk D. Data cannot be written to this disk space, as there is no corresponding disk space available in the disk group to create redundant data.
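The arithmetic in this example can be reproduced with a short sketch. It is a simplified model of the description above, not the controllers' actual allocation algorithm: it assumes that data is spread across every disk that still has free space, and that space is usable only while at least two disks have room. The function name `spanned_usage` is hypothetical.

```python
def spanned_usage(disk_sizes_gb):
    """Return (used, unusable) GB per disk under the simplified model.

    Each round consumes an equal amount from every disk that still has
    free space, limited by the smallest remaining amount. Allocation
    stops when fewer than two disks have space left.
    """
    remaining = dict(disk_sizes_gb)
    used = {name: 0 for name in remaining}
    while True:
        active = [name for name, left in remaining.items() if left > 0]
        if len(active) < 2:
            break  # no partner disk is left to hold redundant data
        chunk = min(remaining[name] for name in active)
        for name in active:
            remaining[name] -= chunk
            used[name] += chunk
    return used, remaining

used, unusable = spanned_usage({"A": 40, "B": 40, "C": 60, "D": 80})
print("used:", used)
print("unusable:", unusable)
```

In the first round all four disks contribute 40 GB each, and in the second round Disks C and D contribute another 20 GB each. Disk D is then the only disk with free space, so its last 20 GB remains unusable, as described above.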
Comparing RAID Level and Concatenation Performance
The following table compares the performance characteristics associated with the more common RAID levels. This table provides general guidelines for choosing a RAID level. Keep in mind the needs of your particular environment when choosing a RAID level.
NOTE: The following table does not show all RAID levels supported by Storage Management. For information on all RAID levels supported by Storage Management, see "Choosing RAID Levels and Concatenation."
RAID Level and Concatenation Performance Comparison