RAID & Direct Attached Disk Systems

When tuning any system, you can address speed, price, or quality; and of these three, you can optimise for only two. Thus, cheap, fast systems will be of low quality; fast, good systems will never be cheap; and inexpensive, good systems are never fast. RAID storage systems combine many smaller, inexpensive disks to form larger, logical drives. Different RAID configurations can provide more storage, faster performance, or improved reliability, depending on your needs. RAID levels 1 to 5 were originally described in a paper published at the University of California, Berkeley in 1988 by researchers Patterson, Gibson and Katz. RAID levels 0 and 0+1 were added by the computer industry. There is no single optimum RAID level; you have to decide how much speed you are willing to trade for security of data.

There are a number of different RAID levels:

  • Level 0 -- Striped Disk Array without Fault Tolerance: Provides data striping (spreading out blocks of each file across multiple disk drives) but no redundancy. This improves performance but does not deliver fault tolerance. If one drive fails then all data in the array is lost.
  • Level 1 -- Mirroring and Duplexing: Provides disk mirroring. Level 1 provides twice the read transaction rate of single disks and the same write transaction rate as single disks.
  • Level 2 -- Error-Correcting Coding: Not a typical implementation and rarely used, Level 2 stripes data at the bit level rather than the block level.
  • Level 3 -- Bit-Interleaved Parity: Provides byte-level striping with a dedicated parity disk. Because every request involves all the drives, Level 3 cannot service multiple requests simultaneously, and it is also rarely used.
  • Level 4 -- Dedicated Parity Drive: Level 4 provides block-level striping (like Level 0) with a dedicated parity disk. If a data disk fails, the parity data is used to rebuild its contents on a replacement disk. A disadvantage of Level 4 is that the parity disk can become a write bottleneck, because every write must also update it.
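  • Level 5 -- Block Interleaved Distributed Parity: Provides data striping at the block level and distributes the parity (error correction) information across all the drives in the array instead of keeping it on one dedicated disk. This results in good performance and good fault tolerance: the array survives the failure of any one drive. Level 5 is one of the most popular implementations of RAID.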
  • Level 6 -- Independent Data Disks with Double Parity: Provides block-level striping with two independent sets of parity data distributed across all disks, so the array survives the failure of any two drives.
  • Level 0+1 -- A Mirror of Stripes: Not one of the original RAID levels. Two RAID 0 stripes are created, and a RAID 1 mirror is created over them. Used for both replicating and sharing data among disks.
  • Level 10 -- A Stripe of Mirrors: Not one of the original RAID levels. Multiple RAID 1 mirrors are created, and a RAID 0 stripe is created over these.
  • Level 7 -- A trademark of Storage Computer Corporation that adds caching to Levels 3 or 4.
  • RAID S -- EMC Corporation's proprietary striped parity RAID system used in its Symmetrix storage systems.
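
The parity used by Levels 4 and 5 can be illustrated with a few lines of Python. This is only a minimal sketch, assuming plain XOR parity over equally sized blocks; the function names and the sample stripe are invented for illustration and do not describe any particular product.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must be the same length")
    # zip(*blocks) walks the blocks column by column, one byte from each.
    return bytes(reduce(xor, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Parity block for one stripe: the XOR of all its data blocks."""
    return xor_blocks(data_blocks)


def rebuild_block(surviving_blocks, parity):
    """Recover the single missing data block from the survivors and parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# One stripe spread over three data drives (hypothetical contents).
stripe = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(stripe)

# Simulate losing the second drive and rebuilding its block.
lost = 1
survivors = [block for i, block in enumerate(stripe) if i != lost]
recovered = rebuild_block(survivors, parity)
print(recovered == stripe[lost])
```

The same arithmetic works whichever drive is lost, but only for one loss per stripe; a second simultaneous failure leaves too little information, which is why Level 6 stores a second, independent set of parity.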


Software RAID
The biggest concern with software RAID is that almost the entire configuration burden is left with the system administrator. A reliable RAID volume is not assembled by slapping together just any set of subdisks; you need to carefully select and combine subdisks to ensure that you are not saturating controllers or inadvertently creating single points of failure in your disk subsystem. In volume-manager terms, a plex is one complete copy of a volume's data, built from one or more subdisks. Any two plexes can be associated to create a mirrored volume, but if the subdisks in the first plex are from the same physical device as those in the other, the loss of that device will ruin both plexes and the mirrored volume will fail.
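
The kind of check an administrator has to make by hand can be sketched as follows. This is a hypothetical helper, not part of any volume manager; it assumes each plex is described as a mapping from subdisk name to physical device, and that devices use Solaris-style names such as c0t1d0, where the part before "t" identifies the controller.

```python
def controller_of(device):
    """Controller part of a device name, assuming names like 'c0t1d0'."""
    return device.split("t", 1)[0]


def mirror_risks(plex_a, plex_b):
    """Report physical devices and controllers shared by two plexes.

    Each plex is a dict mapping subdisk name -> physical device name.
    Anything reported here is a single point of failure for the mirror.
    """
    devices_a = set(plex_a.values())
    devices_b = set(plex_b.values())
    controllers_a = {controller_of(d) for d in devices_a}
    controllers_b = {controller_of(d) for d in devices_b}
    return {
        "shared_devices": sorted(devices_a & devices_b),
        "shared_controllers": sorted(controllers_a & controllers_b),
    }


# Hypothetical layouts: the first mirror reuses a drive, the second does not.
plex1 = {"sd01": "c0t0d0", "sd02": "c0t1d0"}
plex2_bad = {"sd03": "c0t1d0", "sd04": "c1t0d0"}
plex2_good = {"sd03": "c1t0d0", "sd04": "c1t1d0"}

print(mirror_risks(plex1, plex2_bad))
print(mirror_risks(plex1, plex2_good))
```

An empty result for both keys means the two plexes share neither a drive nor a controller, so no single device failure can take out both halves of the mirror.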

Software RAID is less expensive than a hardware solution, but you'll make up the difference in administrative costs. Still, it can be a good solution, especially for small installations. If you are deploying a server with, say, six drives, which you intend to mirror into three logical volumes, software RAID is a perfect way to create that configuration.

Hardware RAID
Hardware RAID systems move all the RAID processing into dedicated processors separate from the server. Typically, a hardware RAID device has a large number of physical drives in it, along with a processor, a lot of firmware, and some amount of cache memory. The device is then attached to the host machine via SCSI or Fibre Channel technology. The host system sees a number of devices on the attached RAID system that it treats as physical disk drives. These devices are actually logical devices created from the physical drives and managed by the internal processor in the RAID system.

Many vendors market hardware RAID solutions, and while pricing and architecture differ dramatically between vendors, the basic architectural components (drives, processor, and cache) are the same in all devices.

When looking for a hardware RAID system, it is important to understand the internal architecture that unites the disks and processor. There is nothing magic inside a RAID system: you'll most likely find either a number of SCSI chains sporting from five to seven drives each, or a direct-to-disk Fibre Channel connection that attaches the drives in one large bus. The processor accepts I/O requests from the server via the appropriate SCSI or Fibre Channel connection, maps the request to one or more drives in the system, collects the results from the drives, and passes the results back to the host system.
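
The mapping step can be sketched for the simplest case, a plain stripe with no parity. This is a minimal illustration rather than any vendor's firmware logic; the function names, the round-robin layout, and the stripe unit of eight blocks are all assumptions made for the example.

```python
def locate_block(logical_block, num_drives, stripe_unit_blocks):
    """Map a logical block number to (drive index, block number on that drive).

    Assumes a simple round-robin stripe: the first stripe unit goes to
    drive 0, the next to drive 1, and so on, wrapping around.
    """
    stripe_unit = logical_block // stripe_unit_blocks   # which unit overall
    offset = logical_block % stripe_unit_blocks         # position inside it
    drive = stripe_unit % num_drives
    row = stripe_unit // num_drives                     # units already on drive
    return drive, row * stripe_unit_blocks + offset


def split_request(start_block, length, num_drives, stripe_unit_blocks):
    """Break one host request into the physical blocks needed on each drive."""
    per_drive = {}
    for logical_block in range(start_block, start_block + length):
        drive, physical = locate_block(logical_block, num_drives, stripe_unit_blocks)
        per_drive.setdefault(drive, []).append(physical)
    return per_drive


# A 4-block read starting at logical block 6 on a 4-drive stripe.
print(split_request(6, 4, num_drives=4, stripe_unit_blocks=8))
# prints {0: [6, 7], 1: [0, 1]}
```

A request that crosses a stripe-unit boundary is served by two drives in parallel, which is where the throughput gain of striping comes from.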

In most high-performance RAID systems, you'll find a generous amount of cache memory. Writes are placed into the cache and an acknowledgement is sent back to the server as quickly as possible; the data is copied from the cache to the drives at some later point in time. Reads are staged into the cache as they are delivered to the server, and data may be read ahead in the hopes that future reads can be satisfied directly from cache. The more cache in the system, the greater the chance that all I/O operations can be initially handled from cache, resulting in maximum throughput from the server's perspective.
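
The write-back behaviour described above can be modelled in a few lines. This is a toy sketch under stated assumptions: the class name is invented, the disks are stood in for by a Python dict, eviction is least-recently-used, and read-ahead is not modelled.

```python
from collections import OrderedDict


class WriteBackCache:
    """Toy write-back cache in front of a slow backing store (a dict)."""

    def __init__(self, backing_store, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.disk = backing_store      # block number -> data
        self.capacity = capacity
        self.cache = OrderedDict()     # block number -> data, oldest first
        self.dirty = set()             # cached blocks not yet on disk

    def write(self, block, data):
        """Store the data in cache and acknowledge immediately."""
        self._insert(block, data)
        self.dirty.add(block)
        return "ack"

    def read(self, block):
        """Return (data, served_from_cache)."""
        if block in self.cache:
            self.cache.move_to_end(block)
            return self.cache[block], True
        data = self.disk[block]        # cache miss: go to the drives
        self._insert(block, data)      # stage it for future reads
        return data, False

    def flush(self):
        """Copy every dirty block to disk, as the controller does later."""
        for block in sorted(self.dirty):
            self.disk[block] = self.cache[block]
        self.dirty.clear()

    def _insert(self, block, data):
        self.cache[block] = data
        self.cache.move_to_end(block)
        while len(self.cache) > self.capacity:
            victim, victim_data = self.cache.popitem(last=False)
            if victim in self.dirty:   # never drop unwritten data silently
                self.disk[victim] = victim_data
                self.dirty.discard(victim)


disk = {0: b"old", 1: b"one"}
cache = WriteBackCache(disk, capacity=2)
cache.write(0, b"new")                 # acknowledged before reaching disk
print(disk[0], cache.read(0))          # disk still holds the old data
cache.flush()
print(disk[0])
```

Between the acknowledgement and the flush, the only copy of the new data lives in cache memory; the contents of the dirty set are exactly what would be at risk in that window.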

Cache introduces its own risks: a power failure at the wrong moment will lose any data held in cache that has not yet been written to disk, leaving the data on disk stale or inconsistent. Good systems have battery backup on their cache to avoid this problem, and often mirror the cache to protect against hardware memory failure.

All hardware RAID systems have some sort of configuration tool that lets you create volumes within the unit. Almost all support RAID 0, 1, 0+1, and 5, with varying levels of control regarding how internal drives are mapped into different volumes. In general, hardware systems offer less flexibility than software RAID solutions, but provide more error checking and configuration control to keep you from making configuration mistakes.

A final difference between hardware- and software-based systems is the ease with which you can remove and replace failed drives. Most hardware systems have drives that are hot swappable, meaning that they can be spun down and removed while the system continues to run. This is a glaring problem with software-based RAID, which is generally implemented atop JBOD disk farms whose drives are simply plugged into a shared bus. Pulling a drive requires shutting down the entire bus, which usually in turn requires a brief system outage. If you cannot tolerate downtime to replace failed drives, you should consider only hardware-based RAID systems.

Obviously, hardware-based RAID is more expensive than software-based systems. If you have the budget to buy a hardware solution, however, then you ought to do it. The improved performance and generally higher reliability are well worth the investment.

SCSI and the origin of RAID
To better understand RAID it is worthwhile looking at SCSI (Small Computer System Interface) technology. The Shugart Associates System Interface (SASI), which dates from around 1979, was used as the basis for the initial standard developed by ANSI committee X3T9.2 (later renamed X3T10). Published as an ANSI standard (X3.131-1986) in 1986 and popularised by Apple's Macintosh Plus in the same year, SCSI used an 8-bit parallel bus and offered a fast throughput (for the time) of 5MB/s. Over the years the original SCSI-I command set has been superseded. Data transfer rates increased with Fast SCSI (introduced with SCSI-II) offering 10MB/s, while a 16-bit version known as Fast/Wide SCSI increased this to 20MB/s. Ultra Wide doubled this again to 40MB/s.

SCSI-I and SCSI-II allowed up to 7 devices to be chained together, while 16-bit (Wide) buses such as Ultra Wide SCSI allow up to 15. However, as data transfer rates increased, maximum cable length decreased due to signal degradation. The next incarnation, Ultra2 SCSI, again doubled performance to a throughput of 80MB/s and, because of a change in the signalling technology to low-voltage differential signalling, increased overall SCSI chain length to 12m. On September 14, 1998, Ultra 3 SCSI was introduced, increasing transfer rates to 160MB/s.
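
The throughput figures for each generation follow from simple arithmetic: bus width in bytes multiplied by the number of transfers per second. The sketch below assumes the commonly cited transfer rates for each parallel SCSI generation (they are not taken from this article) and gives peak bus rates in megabytes per second, not sustained disk throughput.

```python
def bus_rate_mb_per_s(bus_width_bits, megatransfers_per_s):
    """Peak bus rate in megabytes per second: bytes per transfer x transfers."""
    return bus_width_bits // 8 * megatransfers_per_s


# (name, bus width in bits, millions of transfers per second)
# The transfer rates are the commonly cited values, listed here as assumptions.
generations = [
    ("SCSI-I", 8, 5),
    ("Fast SCSI", 8, 10),
    ("Fast/Wide SCSI", 16, 10),
    ("Ultra Wide SCSI", 16, 20),
    ("Ultra2 SCSI (wide)", 16, 40),
    ("Ultra 3 SCSI", 16, 80),
]

rates = {name: bus_rate_mb_per_s(width, mt) for name, width, mt in generations}
for name, rate in rates.items():
    print(f"{name:20s} {rate:4d} MB/s")
```

Each doubling comes from either widening the bus from 8 to 16 bits or doubling the transfer rate, which is why the figures run 5, 10, 20, 40, 80, 160.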

SCSI's massive performance increases have been necessitated by a corresponding increase in capacity of disk drives as users demand more storage capacity. The huge volume of disk drive production has also driven down cost and this was one of the main factors that led to the development of RAID - a Redundant Array of Inexpensive Disks grouped together to appear as a single virtual device to the host system. There are several levels of RAID, denoted by number, covering a spectrum of speed, reliability, and price points.

Beyond the potential cost savings, RAID also allows aggregate disk performance to far exceed the speed and throughput of a single device. However, to get the maximum performance from RAID technology when building a big disk subsystem you must understand application characteristics and overall system loading. Properly configured, RAID also tolerates individual device failure, allowing continuous uptime, so it is important to also establish an acceptable service level.
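
When weighing capacity against fault tolerance, a quick calculation helps. The sketch below is an illustrative helper, not a sizing tool: it assumes identical drives, two-way mirrors for Level 10, and reports only the number of drive failures each level is guaranteed to survive.

```python
def raid_summary(level, num_drives, drive_size_gb):
    """Return (usable capacity in GB, drive failures guaranteed survivable).

    Assumes identical drives. Level "10" assumes two-way mirrors, so it
    needs an even number of drives.
    """
    n, size = num_drives, drive_size_gb
    if level == "0" and n >= 2:
        return n * size, 0            # all space usable, no redundancy
    if level == "1" and n >= 2:
        return size, n - 1            # every drive holds a full copy
    if level == "5" and n >= 3:
        return (n - 1) * size, 1      # one drive's worth of parity
    if level == "6" and n >= 4:
        return (n - 2) * size, 2      # two drives' worth of parity
    if level == "10" and n >= 4 and n % 2 == 0:
        return n // 2 * size, 1       # a second failure may hit the same pair
    raise ValueError(f"unsupported level/drive count: {level!r}, {n}")


# Six hypothetical 500GB drives under each layout.
for level in ("0", "5", "6", "10"):
    usable, tolerated = raid_summary(level, 6, 500)
    print(f"RAID {level:>2}: {usable}GB usable, survives {tolerated} failure(s)")
```

For six 500GB drives this gives 3000GB with no protection at Level 0, 2500GB at Level 5, 2000GB at Level 6, and 1500GB at Level 10, which makes the speed, capacity, and reliability trade-off concrete before any hardware is bought.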



© Network Attached Storage UK Ltd. 2007
