What Constitutes a Disk System?

By Tony Pearson posted Tue July 24, 2007 05:16 AM

  

Originally posted by: TonyPearson


Yesterday, I started this week's topic discussing the various areas of exploration to help understand our recent press release of the IBM System Storage SAN Volume Controller and its impressive SPC-1 and SPC-2 benchmark results that rank it the fastest disk system in the industry.

Some have suggested that since the SVC has a unique design, it should be placed in its own category, and not compared to other disk systems. To address this, I would like to define what IBM means by "disk system" and how it is comparable to other disk systems.

When I say "disk system", I am going to focus specifically on block-oriented direct-access storage systems, which I will define as:

One or more IT components, connected together, that function as a whole, to serve as a target for read and write requests for specific blocks of data.

Clarification: One could argue, and several do in various comments below, that there are other types of storage systems that contain disks, some that emulate sequential access tape libraries, some that emulate file-systems through CIFS or NFS protocols, and some that support the storage of archive objects and other fixed content. At the risk of looking like I may be including or excluding such to fit my purposes, I wanted to avoid apples-to-oranges comparisons between very different access methods. I will limit this exploration to block-oriented, direct-access devices. We can explore these other types of storage systems in later posts.

People who have been working a long time in the storage industry might be satisfied by this definition, thinking of all the disk systems that it would include, and recognizing that other types of storage, like tape systems, are appropriately excluded.

Others might be scratching their heads, thinking to themselves "Huh?" So, I will provide some background, history, and additional explanation. Let's break up the definition into different phrases, and handle each separately.

read and write requests

Let's start with "read and write requests", which we often lump together generically as input/output requests, or just I/O requests. Typically an I/O request is initiated by a host, over a cable or network, to a target. The target responds with an acknowledgment, data, or a failure indication. A host can be a server, workstation, personal computer, laptop or other IT device that is capable of initiating such requests, and a target is a device or system designed to receive and respond to such requests.
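To make the host/target exchange concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not how any real protocol encodes requests: the names IORequest, IOResponse, Status and Target are made up for this example, and a read of a block that was never written is simply treated as a failure.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Status(Enum):
    """The three kinds of response a target can give."""
    ACK = "acknowledgment"
    DATA = "data"
    FAILURE = "failure"


@dataclass
class IORequest:
    op: str                          # "read" or "write"
    block: int                       # block number being addressed
    payload: Optional[bytes] = None  # only used for writes


@dataclass
class IOResponse:
    status: Status
    data: Optional[bytes] = None     # only set when status is DATA


class Target:
    """Hypothetical target that holds a fixed number of blocks."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        self.blocks: Dict[int, bytes] = {}

    def handle(self, req: IORequest) -> IOResponse:
        # Requests for blocks outside the target are rejected.
        if not 0 <= req.block < self.num_blocks:
            return IOResponse(Status.FAILURE)
        if req.op == "write" and req.payload is not None:
            self.blocks[req.block] = req.payload
            return IOResponse(Status.ACK)
        if req.op == "read" and req.block in self.blocks:
            return IOResponse(Status.DATA, self.blocks[req.block])
        # Unknown operation, or a read of a block never written (toy rule).
        return IOResponse(Status.FAILURE)
```

The host side is just whatever code builds an IORequest and calls handle(); in a real system the request would travel over one of the cables or networks listed below.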

(An analogy might help. A woman wants to ask the local public library a question. She picks up the phone and dials the number of the one down the street. A man working at the library hears the phone ring and answers it with "Welcome to the Public Library! How can I help you?" She asks "What is the capital city of Ethiopia?" and he replies "Addis Ababa." Satisfied with this response, she hangs up. In this example, the query for information was the I/O request, initiated by the lady, to the public library target.)

Today, there are three popular ways I/O requests are made:

  • CCW commands over OEMI, ESCON or FICON cables
  • SCSI commands over SCSI, Fibre Channel or SAS cables
  • SCSI commands over Ethernet cables, wireless or other IP communication methods

specific blocks of data

In 1956, IBM was the first to deliver a disk system. It was different from tape because it was a "direct access storage device" (the acronym DASD is still used today by some mainframe programmers). Tape was a sequential medium, so it could handle commands like "read the next block" or "write the next block", but it could not read a particular block without first reading past the other blocks in front of it, nor could it write over an existing block without risking overwriting the contents of the blocks past it.
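The difference between sequential and direct access can be sketched in a few lines of Python. This is a toy model under my own assumptions: the Tape and Disk classes are hypothetical, and the only "cost" counted is how many blocks had to be read past to reach the one requested.

```python
class Tape:
    """Toy sequential medium: the only primitive is 'read the next block'."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.position = 0  # index of the next block under the head

    def read_next(self):
        block = self.blocks[self.position]
        self.position += 1
        return block

    def read_block(self, n):
        """Reach block n by reading past every block in front of it."""
        if n < self.position:
            self.position = 0  # rewind to the start (toy assumption)
        passed = 0
        while self.position < n:
            self.read_next()
            passed += 1
        return self.read_next(), passed


class Disk:
    """Toy direct-access device: any block can be addressed by number."""

    def __init__(self, blocks):
        self.blocks = list(blocks)

    def read_block(self, n):
        # No other blocks need to be read to reach block n.
        return self.blocks[n], 0
```

With four blocks on each, asking the tape for block 3 from the start means reading past three others, while the disk goes straight to it.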

The nature of a "block" of data varies. It is represented by a sequence of bytes of specific length. The length is determined in a variety of ways.

  • CCW commands assume a Count-Key-Data (CKD) format for disk, meaning that tracks are fixed in size, but each track can hold one or more blocks, and those blocks can be fixed or variable in length. Some blocks can span off the end of one track and over to another track. Typical block sizes in this case are 8000 to 22000 bytes.
  • SCSI commands assume a Fixed-Block-Architecture (FBA) format for disk, where all blocks are the same size, almost always a power of two, such as 512 or 4096 bytes. A few operating systems, however, such as i5/OS on IBM System i machines, use a block size that doesn't follow this power-of-two rule.
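With fixed-size blocks, working out which blocks a request touches is simple integer arithmetic. The following Python sketch shows that calculation; the function name blocks_for_range and the default of 512 bytes are my own choices for illustration, and it applies only to the fixed-block case, not to CKD.

```python
def blocks_for_range(offset, length, block_size=512):
    """Return (first_block, last_block) touched by a byte range.

    offset and length are in bytes; blocks are numbered from zero.
    """
    if offset < 0 or length <= 0 or block_size <= 0:
        raise ValueError("offset must be >= 0, length and block_size > 0")
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return first, last
```

For example, with 512-byte blocks, a 100-byte range starting at byte 1000 begins in block 1 and ends in block 2, so the device would need to access both blocks to satisfy it.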

one or more IT components

You may find one or more of the following IT components in a disk system:

  • customized or general-purpose processing chips
  • memory, such as RAM, Flash, or similar
  • batteries and/or other power supply
  • Host attachment cards or ports
  • motorized platter(s) covered in magnetic coating with a read/write head to move over its surface. These are often referred to as Hard Disk Drive (HDD) or Disk Drive Modules (DDM), and are manufactured by companies like Seagate or Hitachi Global Storage Technologies.

A set of HDD can be accessed individually, affectionately known as JBOD for Just-a-bunch-of-disk, or collectively in a RAID configuration.

Memory can act as the high-speed cache in front of slower storage, or as the storage itself. For example, the solid state disk that IBM announced last week is entirely memory storage, using Flash technology.

Lately, there are two popular packaging methods for disk systems:

  • Monolithic -- all the components you need connected together inside a big refrigerator-sized unit, with options to attach additional frames. The IBM System Storage DS8000, EMC Symmetrix DMX-4 and HDS TagmaStore USP-V all fit this category.
  • Modular -- components that fit into standard 19-inch racks, often the size of the vegetable drawer inside a refrigerator, that can be connected externally with other components, if necessary, to make a complete disk system. The IBM System Storage DS6000, DS4000, and DS3000 series, as well as our SVC and N series, fall into this category.

Regardless of packaging, the general design is that a "controller" receives a request from its host attachment port, and uses its processors and cache storage to either satisfy the request, or pass the request to the appropriate HDD, and the results are sent back through the host attachment port.
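The read side of that design can be sketched as a small cache sitting in front of slower drives. This Python example is a simplification under stated assumptions: the Controller class is hypothetical, a plain dictionary stands in for the HDDs, and the least-recently-used eviction rule is my choice for the sketch, not a description of any particular product.

```python
from collections import OrderedDict


class Controller:
    """Toy controller: a small read cache in front of slower backing drives."""

    def __init__(self, backing, cache_blocks=2):
        self.backing = backing        # dict: block number -> bytes (stands in for HDDs)
        self.cache = OrderedDict()    # ordered from least to most recently used
        self.cache_blocks = cache_blocks
        self.hits = 0
        self.misses = 0

    def read(self, block):
        if block in self.cache:
            # Read hit: satisfied from cache without touching the drives.
            self.hits += 1
            self.cache.move_to_end(block)
            return self.cache[block]
        # Read miss: pass the request to the backing drive, then cache the result.
        self.misses += 1
        data = self.backing[block]
        self.cache[block] = data
        if len(self.cache) > self.cache_blocks:
            self.cache.popitem(last=False)  # evict the least recently used block
        return data
```

Whether the backing dictionary represents drives inside the same enclosure or in a separate, cabled unit makes no difference to this logic.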

In all of the monolithic systems, as well as some of the modular ones, the controller and HDD storage are contained in the same unit. On other modular systems, the controller is one system, and the HDD storage is in a separate system, and they are cabled together.



serve as a target

The last part is that a disk system must be able to satisfy some or all requests that come to it.

(Using the same analogy used above, when the lady asked her question, the guy at the public library knew the answer from memory, and replied immediately. However, for other questions, he might need to look up the answer in a book, do a search on the internet, or call another library on her behalf.)

Some disk systems are cache-only controllers. For these, either the I/O request is satisfied as a read-hit or write-hit in cache, or it is not, and has to go to the HDD. The IBM DS4800 and N series gateways are examples of this type of controller.

Other systems may have controller and disk, but support additional disk attachment. In this case, either the I/O request is handled by the cache or internal disk, or it has to go out to external HDD to satisfy the request. IBM DS3000 series, DS4100, DS4700, and our N series appliance models, all fall into this category.

So, the SAN Volume Controller is a disk system comprising one to four node-pairs. Each node is a piece of IT equipment that has processors and cache. These node-pairs are connected to a pair of UPS power supplies to protect the cache memory holding writes that have not yet been de-staged. The combination of node-pairs and UPS, acting as a whole, is able to serve as a target for SCSI commands sent over Fibre Channel cables on a Storage Area Network (SAN). To read some blocks of data, it uses its internal cache storage to satisfy the request, and for others, it goes out to the external disk systems that contain the data required. All writes are satisfied immediately in cache on the SVC, and later de-staged to external disk when appropriate.
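The write behavior described here, acknowledging the write from cache and de-staging later, can be sketched as follows. This is a generic write-back cache in Python, not SVC code: the WriteBackCache class is hypothetical, a dictionary stands in for the external disk systems, and de-staging happens only when destage() is called explicitly.

```python
class WriteBackCache:
    """Toy write-back cache in front of external storage."""

    def __init__(self, external):
        self.external = external  # dict standing in for external disk systems
        self.cache = {}           # block number -> bytes held in cache
        self.dirty = set()        # blocks written but not yet de-staged

    def write(self, block, data):
        # The write is satisfied in cache and acknowledged immediately.
        self.cache[block] = data
        self.dirty.add(block)
        return "ack"

    def read(self, block):
        if block in self.cache:
            return self.cache[block]
        # Not in cache: go out to external storage and keep a copy.
        data = self.external[block]
        self.cache[block] = data
        return data

    def destage(self):
        """Copy every dirty block to external storage; return how many."""
        count = len(self.dirty)
        for block in sorted(self.dirty):
            self.external[block] = self.cache[block]
        self.dirty.clear()
        return count
```

Until destage() runs, the newest copy of a written block exists only in cache, which is why protecting that memory against power loss matters.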

As of end of 2Q07, having reached our four-year anniversary for this product, IBM has sold over 9000 SVC nodes, which are part of more than 3100 SVC disk systems. These things are flying off the shelves, clocking in a 100% YTY growth over the amount we sold twelve months ago. Congratulations go to the SVC development team for their impressive feat of engineering that is starting to catch the attention of many customers and return astounding results!

So, now that I have explained why the SVC is considered a disk system, tomorrow I'll discuss metrics to measure performance.



Comments

Thu July 26, 2007 02:03 AM

Blog post updated. Added clarification that I will accept that there are other access protocols out there, and other forms of storage systems that contain disk that others might call "disk systems". I will focus on block-oriented devices that support direct-access I/O requests. This should still be broad enough definition to include the majority of disk systems available in the market today.

Wed July 25, 2007 11:15 PM

John asks: Could you please take a moment to restate or clarify what differentiates the two types of targets, say a DS4800 from a DS4700? Is the difference merely a physical separation of cache from HDDs on an enclosure level?
In both cases, you go through a controller and satisfy the I/O request either with cache or HDD.
In the case of the DS4800, if the I/O request can be satisfied with cache, you're done, and if not, it goes outside the box, to expansion drawers like the EXP420 or EXP810 where the HDD are stored.
In the case of the DS4700, if the I/O request can be satisfied with cache or the HDD inside the DS4700 enclosure, you're done, and if not, it goes outside the box, to expansion drawers like the EXP420 or EXP810 where additional HDD are stored.
So the real difference is just packaging, and whether cables are inside or outside each enclosure.

Wed July 25, 2007 11:08 PM

Tmasteen asks: But there is an option to disable the cache of the vdisks. So when do you want to do this?
The SVC does allow you to turn cache off for specific virtual disks (strictly speaking, as with all modern disk systems, all I/O passes thru cache but is not retained in cache if cache is "turned off"; writes are destaged immediately in the order they arrive). Customers might do this for two reasons:
Reason 1: Wanting to conserve cache space for more important applications. Generally speaking, this is not a good reason to turn cache off since automated cache management, which is what disk systems usually do, is almost always better than making a manual decision about cache management. If you absolutely know that a specific workload using any amount of the cache will hurt other applications that are more important to you, then this is an option.
Reason 2: Wanting to use replication functions of the underlying disk system. Most people buy SVC because they want common consistent network-based replication functions. But they might need to keep using some replication functions of the disk systems, most likely during a migration period or when SVC is part of a large composite application that combines data from other platforms. To ensure consistent data states in the underlying disk system when its replication functions are used, you need to turn off SVC caching.
So technically, when I said that the SVC caches all writes, I should have added "... unless you ask the SVC not to, for that particular virtual disk."

Wed July 25, 2007 11:01 PM

I have had several people say that my "definition" eliminates other storage that may contain HDD, such as virtual tape libraries VTL, NAS-only filers, and archive/compliance storage like the IBM DR550, IBM N series with SnapLock, and EMC Centera. I made an effort to define disk systems broad enough that they included the majority of disk systems that support business applications, either as raw logical volumes, or a variety of file systems from each operating system, that could replace one for the other in such comparisons.
It would be more fair to compare VTL with physical tape libraries, since commands such as read-next-block and write-next-block are handled in a similar manner between these two.
It would be more fair to compare NAS-only filers with combinations of OS-specific file systems plus disk system. How fast an application can read or write files, create files, delete files, and move or copy files can be measured between NAS and other file systems.
It would be more fair to compare archive/compliance storage from the applications that write to such devices. How quickly such a device can store an object, whether it is a row in a database table, an e-mail attachment, or an entire file, can be compared with other similar devices.
So, while comparisons can be made in these other areas, different metrics and measurement approaches are required.

Wed July 25, 2007 01:28 PM

"serve as a target" question.
Could you please take a moment to restate or clarify what differentiates the two types of targets, say a DS4800 from a DS4700? Is the difference merely a physical separation of cache from HDDs on an enclosure level?

Wed July 25, 2007 01:24 PM

A question about the SVC:
Quote from the blog: All writes are satisfied immediately in cache on the SVC, and later de-staged to external disk when appropriate.
But there is an option to disable the cache of the vdisks. So when do you want to do this?

Wed July 25, 2007 11:24 AM

Exactly! NetApp, and by extension IBM System Storage N series, that support FCP and iSCSI access, fall into this definition of "disk system". Applications that read/write specific blocks of data, either with applications like databases that access blocks directly, or OS-specific file systems that write directories, files and other internal information on specific blocks, would be able to use these in either FCP or iSCSI mode. Thus, these FCP/iSCSI-enabled boxes could be compared with other FCP/iSCSI-enabled boxes.
Devices that only supported file-level protocols, NFS, CIFS, HTTP or FTP, would not provide this support. Reading 4096 bytes of a file is different than reading 4096 bytes of a block, because the first involves additionally reading various directory structures and writing various statistics that file systems do above the layer of block level devices.

Wed July 25, 2007 10:27 AM

So, by your definition, is a NetApp filer a disk system? What about a straight NAS? The first allows for block (specific SCSI) and file level access, and the second does file level only.