] raises some interesting points and questions in the comments section about the new IBM XIV Nextra architecture. I answer these below not just for the benefit of my friends at EMC, but also for my own colleagues within IBM, IBM Business Partners, analysts and clients that might have similar questions.
- If RAID 5/6 makes sense on every other platform, why not so on the Web 2.0 platform?
BarryB writes:
"Your attempt to justify the expense of Mirrored vs. RAID 5 makes no sense to me. Buying two drives for every one drive's worth of usable capacity is expensive, even with SATA drives. Isn't that why you offer RAID 5 and RAID 6 on the storage arrays that you sell with SATA drives? And if RAID 5/6 makes sense on every other platform, why not so on the (extremely cost-sensitive) Web 2.0 platform? Is faster rebuild really worth the cost of 40+% more spindles? Or is the overhead of RAID 6 really too much for those low-cost commodity servers to handle."
Let's take a look at various disk configurations, for example 3TB on 750GB SATA drives:
- JBOD: 4 drives
- JBOD here is industry slang for "Just a Bunch of Disks" and was invented as the term for "non-RAID". Each drive would be accessible independently, at native single-drive speed, with no data protection. Compared with four separate drives in their own enclosures, putting four drives in a single cabinet like this offers only simplicity and convenience.
- RAID-10: 8 drives
- RAID-10 is a combination of RAID-1 (mirroring) and RAID-0 (striping). In a 4x2 configuration, data is striped across disks 1-4, then these are mirrored across to disks 5-8. You get performance improvement and protection against a single drive failure.
- RAID-5: 5 drives
- This would be a 4+P configuration, where there would be four drives' worth of data scattered across five drives. This gives you almost the same performance improvement as RAID-10, similar protection against single drive failure, but with fewer drives per usable TB capacity.
- RAID-6: 6 drives
- This would be a 4+2P configuration, where the first P represents linear parity, and the second represents a diagonal parity. The performance improvement is similar to RAID-5, but it protects against both single and double drive failures, and it is still better than RAID-10 in terms of drives per TB of usable capacity.
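The drive counts in the list above follow from simple arithmetic, which can be sketched as below. This is only an illustration: the `drives_needed` and `extra_spindles_pct` helpers and the layout names are made up for this example, and it assumes a single RAID rank with no spares counted.

```python
# Drive counts for a target usable capacity under each layout discussed above.
# The helper names are hypothetical; a single rank and no spare drives are assumed.

DRIVE_GB = 750      # capacity of one SATA drive
USABLE_GB = 3000    # target usable capacity (3TB)


def drives_needed(layout: str, usable_gb: int = USABLE_GB, drive_gb: int = DRIVE_GB) -> int:
    """Return how many drives a layout needs to provide the usable capacity."""
    data_drives = -(-usable_gb // drive_gb)  # ceiling division
    extra = {
        "JBOD": 0,               # no protection, no extra drives
        "RAID-10": data_drives,  # every data drive has a mirror
        "RAID-5": 1,             # one drive's worth of parity
        "RAID-6": 2,             # two drives' worth of parity
    }
    if layout not in extra:
        raise ValueError(f"unknown layout: {layout}")
    return data_drives + extra[layout]


counts = {name: drives_needed(name) for name in ("JBOD", "RAID-10", "RAID-5", "RAID-6")}


def extra_spindles_pct(layout: str, baseline: str) -> float:
    """Percentage of additional drives a layout needs relative to a baseline layout."""
    return 100 * (counts[layout] - counts[baseline]) / counts[baseline]


for name, n in counts.items():
    print(f"{name}: {n} drives")
```

Under these assumptions, `extra_spindles_pct("RAID-10", "RAID-5")` and `extra_spindles_pct("RAID-10", "RAID-6")` show how many more spindles mirroring needs than each parity layout for the same 3TB.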
For all the RAID configurations, rebuild would require a spare drive, but often spares are shared among multiple RAID ranks, not dedicated to a single rank. To this end, you often have to have several spares per I/O loop, and a different set of spares for each kind of speed and capacity. If you had a mix of 15K/73GB, 10K/146GB, and 7200/500GB drives, then you would have three sets of spares to match.
In contrast, IBM XIV's innovative RAID-X approach doesn't require any spare drives, just spare capacity on existing drives being used to hold data. The objects can be mirrored between any two types of drives, so no need to match one with another.
All of these RAID levels represent some trade-off between cost, protection and performance, and IBM offers each of these on various disk systems platforms. Calculating parity is more complicated than just mirrored copies, but this can be done with specialized chips in cache memory to minimize performance impact. IBM generally recommends RAID-5 for high-performance FC disk, and RAID-6 for slower, large capacity SATA disk.
However, the question assumes that the drive cost is a large portion of the overall "disk system" cost. It isn't. For example, Jon Toigo discusses the cost of EMC's new AX4 disk system in his post [National Storage Rip-Off Day]:
- EMC is releasing its low end CLARiiON AX4 SAS/SATA array with 3TB capacity for $8600. It ships with four 750GB SATA drives (which you and I could buy at list for $239 per unit). So, if the disk drives cost $956 (presumably far less for EMC), that means buyers of the EMC wares are paying about $7700 for a tin case, a controller/backplane, and a 4Gbps iSCSI or FC connector. Hmm.
- Dell is offering EMC’s AX4-5 with same configuration for $13,000 adding a 24/7 warranty.
(Note: I checked these numbers. $8599 is the list price that EMC has on its own website. External 750GB drives available at my local Circuit City ranged from $189 to $329 list price. I could not find anything on Dell's own website, but found [The Register] to confirm the $13,000 with 24x7 warranty figure.)
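To make the point concrete, here is a rough sketch of the arithmetic using only the figures quoted above ($8600 system price, four drives at $239 list). The last step, pricing in four additional drives for mirroring while leaving everything else unchanged, is my own simplifying assumption for illustration and not a real quote from any vendor.

```python
# Share of the quoted system price that is attributable to the drives.
# Prices come from the quoted post; the "mirrored" scenario is a hypothetical.

SYSTEM_PRICE = 8600   # quoted AX4 price with 3TB
DRIVE_PRICE = 239     # quoted list price of one 750GB SATA drive
DRIVE_COUNT = 4

drive_cost = DRIVE_PRICE * DRIVE_COUNT       # cost of the drives alone
non_drive_cost = SYSTEM_PRICE - drive_cost   # everything that is not a drive
drive_share = drive_cost / SYSTEM_PRICE      # fraction of the price that is drives

# Hypothetical: add four more drives at list price for mirroring and
# assume the rest of the system price stays the same.
mirrored_price = SYSTEM_PRICE + drive_cost
mirrored_increase = mirrored_price / SYSTEM_PRICE - 1

print(f"Drives: ${drive_cost}, everything else: ${non_drive_cost}")
print(f"Drive share of system price: {drive_share:.1%}")
print(f"Price increase if drives were doubled: {mirrored_increase:.1%}")
```

At list prices the drives are a small slice of the total, so doubling the spindle count moves the system price far less than doubling it.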
Disk capacity is a shrinking portion of the total cost of ownership (TCO). In addition to capacity, you are paying for cache, microcode and electronics of the system itself, along with software and services that are included in the mix, and your own storage administrators to deal with configuration and management. For more on this, see [XIV storage - Low Total Cost of Ownership].
- EMC Centera has been doing this exact type of blob striping and protection since 2002
BarryB writes:
"As I've noted before, there's nothing 'magic' about it - Centera has been employing the same type of object-level replication for years. Only EMC's engineers have figured out how to do RAID protection instead of mirroring to keep the hardware costs low while not sacrificing availability."
I agree that IBM XIV was not the first to do an object-level architecture, but it was one of the first to apply object-level technologies to the particular "use case" and "intended workload" of Web 2.0 applications.
RAID-5 based EMC Centera was designed instead to hold fixed-content data that needed to be protected for a specific period of time, such as to meet government regulatory compliance requirements. This is data that you most likely will never look at again unless you are hit with a lawsuit or investigation. For this reason, it is important to get it on the cheapest storage configuration possible. Before EMC Centera, customers stored this data on WORM tape and optical media, so EMC came up with a disk-only alternative offering. IBM System Storage DR550 offers disk-level access for the most recent archives, with the ability to migrate to much less expensive tape for long-term retention. The end result is that storing on a blended disk-plus-tape solution can help reduce the cost by a factor of 5x to 7x, making RAID level discussion meaningless in this environment. For more on this, see my post [Optimizing Data Retention and Archiving].
While both the Centera and DR550 are based on SATA, neither is designed for Web 2.0 platforms. When EMC comes out with their own "me, too" version, they will probably make a similar argument.
- IBM XIV Nextra is not a DS8000 replacement
BarryB opines:
"Nextra is anything but Enterprise-class storage, much less a DS8000 replacement. How silly of all those folks to suggest such a thing."
I did searches on the Web and could not find anybody, other than EMC employees, who suggested that IBM XIV Nextra architecture represented a replacement for IBM System Storage DS8000. The IBM XIV press release does not mention or imply this, and certainly nobody I know at IBM has suggested this.
The DS8000 is designed for a different "use case" and set of "intended workloads" than what the IBM XIV was designed for. The DS8000 is the most popular disk system for centralized computing, such as our IBM System i and System z mainframe platforms, offering features like 528-byte block sizes, Count-Key-Data (CKD) volumes, and FICON host attachment. For this client segment of the marketplace, IBM is #1 for high-end disks. As long as IBM is successful with System i and System z mainframes, it will continue to offer disk storage like IBM DS8000 that addresses these requirements.
However, for the market segment of clients that run primarily with distributed computing, like Windows, Linux, AIX, HP-UX or Solaris, we find that traditionally IBM has not been #1 in that segment. These clients, including Web 2.0 companies, can now choose IBM XIV Nextra. The IBM XIV Nextra is a Tier-1, high-end, enterprise-class disk system that serves as an ideal replacement for EMC Symmetrix, HDS USP-V, and whatever Hewlett-Packard calls their re-badged Hitachi gear these days.
This is not an either/or discussion. For example, Web 2.0 companies might choose IBM XIV Nextra for their digital content, and other operations on DS8000. There is no reason a storage vendor must limit themselves to a single high-end disk offering. IBM will have 2, 3 or more high-end disk systems, as many as needed to address the various needs of different market segments. Different storage for different purposes.
- Given drive growth rates have slowed, improving utilization is mandatory to keep up with 60-70 percent CAGR
BarryB writes:
"Look around you, Tony- all of your competitors are implementing thin provisioning specifically to drive physical utilization upwards towards 60-80%, and that's on top of RAID 5/RAID 6 storage and not RAID 1. Given that disk drive growth rates and $/GB cost savings have slowed significantly, improving utilization is mandatory just to keep up with the 60-70% CAGR of information growth."
Growth in disk drive capacities has slowed for FC disk because much of the attention and investment has been re-directed to ATA technology. Dollar-per-GB price reduction is slowing for disks in general, as researchers are hitting physical limitations on the number of bits they can pack per square inch of disk media, and is now around 25 percent per year. The 60-70 percent Compound Annual Growth Rate (CAGR) is real, and can be even higher for Web 2.0 providers. While hardware costs drop, the big ticket items to watch will be software, services and storage administrator labor costs.
To this end, IBM XIV Nextra offers thin provisioning and differential space-efficient snapshots. It is designed for 60-90 percent utilization, and can be expanded to larger capacities non-disruptively in a very scalable manner.
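As a back-of-the-envelope sketch of how these rates interact, the snippet below combines the 60-70 percent data growth with the roughly 25 percent yearly drop in dollars-per-GB, and shows how utilization changes the raw capacity needed. The function names and the 100TB sample figure are assumptions of mine for illustration; protection overhead such as mirroring or parity is not modeled.

```python
# How data growth, price decline and utilization interact.
# Function names and the 100TB example are hypothetical; RAID overhead is ignored.


def raw_spend_growth(data_growth: float, price_decline: float) -> float:
    """Yearly change in capacity spending when data grows and $/GB falls."""
    return (1 + data_growth) * (1 - price_decline) - 1


def raw_capacity_needed(data_tb: float, utilization: float) -> float:
    """Raw capacity required to hold data_tb at a given utilization (0-1)."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return data_tb / utilization


for growth in (0.60, 0.70):
    change = raw_spend_growth(growth, 0.25)
    print(f"{growth:.0%} data growth, 25% price decline: spend changes {change:+.1%}")

for util in (0.60, 0.90):
    raw = raw_capacity_needed(100, util)
    print(f"100TB of data at {util:.0%} utilization needs {raw:.1f}TB raw")
```

The point of the sketch is only that price declines no longer offset growth on their own, so utilization is one of the remaining levers.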
Well, I hope that helps clear some things up.