That’s right: in 1956, companies were gladly paying $640,000 per month per GB for spinning data storage on the IBM RAMAC. RAMAC was available only on a lease basis, with the best support any company could provide. Physical tape was considerably less expensive, but the advancement into the age of nearline storage necessitated invention. This was the “death of tape”, which was predicted to be less than 10 years away.
Fast forward 70 years, and both HDD and physical tape are still critical to the global retention of data. Regardless of the mechanism that collects the data, such as the “cloud”, all data ends up on HDDs and/or on tape. Tape is more important to data preservation and retention today than ever before. Given the sheer magnitude of data and the value associated with it, deleting data that has not been verified as “of no use” becomes a lost opportunity. If tape is so important, why is the capacity shipped each year so much less than HDD?
In a world obsessed with continuous evaluation of company and product performance, down to the week and day in some instances, momentary measurements are often confused with overall engagement. The 12-month view of capacity shipments is an easily collected data set for each storage type: memory, SSD, HDD and tape. We could even throw in optical if we wanted to evaluate consumer data media. However, the 12-month cycle boundary is distorted by the lifecycles and use cases of the storage media.
Flash and SSDs continue to show increasing shipped capacity, driven by larger unit capacity and consumer adoption. Very few computers now come with an HDD, given that most office and home systems require less than a terabyte of storage. Flash is further bolstered by the consumer portable device and IoT markets. The volume of units shipped is the big contributor to total yearly capacity shipments, but most of these devices have lives of less than 4 years. SSDs are also heavily used by crypto miners, where an SSD can have a life of as few as 40 days! The shorter the lifecycle, the more shipments are required to keep systems running over the same time period.
HDDs have fallen out of favor with consumers, a trend that started around 2009 and is clearly visible in unit shipments. HDDs have also increased in capacity significantly over the last 12 years, so fewer units can store more data. HDDs are also used less frequently for higher-performance operations, because far more spindles are needed to match Flash/SSD technology, making the latter ultimately more cost efficient. HDDs are still the mainstay of the enterprise world for network-attached storage, block storage and bulk storage of data; object storage was essentially designed for HDDs. The large compute conglomerates, the “Hyperscalers”, have shored up the units purchased each year, while also driving down the price of HDDs through shrewd negotiations. The physical challenges facing the HDD industry over the next decade may change the trajectory of units and costs, as increases in HDD capacity per unit begin to appear at a slower pace. HDD lifecycles tend to be 4 to 5 years. Refresh cycles have been influenced by HDD unit fall-out rates, which begin to climb after year 5.
Tape has been an enterprise-only choice for more than 2 decades. Why? Tape reached maturity on the Gartner® Hype Cycle® in the 1990s, and its capacity and use cases did not fit the consumer space. However, the extreme cost differential between tape and HDD has made tape a mainstay for data retention in the enterprise and for hyperscale companies. Even as the price of HDD plummeted, tape has remained 4 to 5 times lower in total cost of ownership.
That brings us back to the question: why is the capacity shipped each year so much less than HDD? Tape media has an average field life of 9.2 years, with some very large customers reasonably using a single piece of tape media for up to 12 years. In a standard archive scenario retaining data for 10 years, HDDs consume nearly 2 times the number of units, with 3 times the data management (Table 1). This is a contributing factor to HDD's continued high volumes of sales and shipped capacity, even in the archive storage space.
Table 1. Comparison of HDD and tape media for 27 PB, 8% CAGR, 10-year retention, 4-year HDD refresh cycle, HDD published roadmap, single-generation update for tape, no migration.
As the data indicates, HDD refresh requirements and the use of HDDs as transient data storage answer the question. Tape is more efficient in nearly every manner, including sustainability, when compared to HDDs in the archive data space. With few media being replaced each year for tape, the real measurement is capacity over a rolling period of years, not months.
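To make the refresh effect concrete, here is a minimal Python sketch of the arithmetic behind that argument. It takes only the scenario parameters named in the Table 1 caption (27 PB starting capacity, 8% CAGR, 10 years, 4-year HDD refresh) and is not a reproduction of Table 1: it tracks purchased capacity rather than unit counts, ignores the HDD roadmap and tape generation changes, and assumes like-for-like replacement. The function names are hypothetical and chosen for illustration only.

```python
def required_capacity_pb(initial_pb, cagr, years):
    """Capacity (PB) the archive must hold in each year 0..years-1."""
    return [initial_pb * (1 + cagr) ** y for y in range(years)]


def purchased_capacity_pb(initial_pb, cagr, years, refresh_years=None):
    """Capacity (PB) bought in each year.

    Assumption: media bought in year y is replaced like-for-like in year
    y + refresh_years. refresh_years=None models media that outlives the
    retention period (no refresh, no migration).
    """
    need = required_capacity_pb(initial_pb, cagr, years)
    bought = []
    for y in range(years):
        # New capacity needed to cover this year's growth (year 0 buys it all).
        growth = need[y] - (need[y - 1] if y > 0 else 0.0)
        # Capacity that has reached end of life and must be bought again.
        replaced = 0.0
        if refresh_years is not None and y >= refresh_years:
            replaced = bought[y - refresh_years]
        bought.append(growth + replaced)
    return bought


if __name__ == "__main__":
    no_refresh = purchased_capacity_pb(27.0, 0.08, 10)
    four_year = purchased_capacity_pb(27.0, 0.08, 10, refresh_years=4)
    print(f"No refresh:     {sum(no_refresh):.1f} PB purchased over 10 years")
    print(f"4-year refresh: {sum(four_year):.1f} PB purchased over 10 years")
    print(f"Ratio:          {sum(four_year) / sum(no_refresh):.2f}x")
```

Under these simplified assumptions, the refreshed medium ships more than twice the capacity over the decade while storing exactly the same data, which is why a rolling multi-year view tells a different story than any single 12-month shipment report.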
In conclusion, from the moment data is created it is on a journey. If the data is to be retained for any period longer than 1 year, that journey will end on tape. The idea that all data will reside on a single type of digital storage media is feasible but comes with a high cost.
That cost is paid in acquisition, in operation and in sustainable IT. Misperceptions around the ease of use of tape often stem from legacy operational experience or simply no experience. Data on tape is handled much the same as block-level HDD communications, with the difference being that tape is a linear medium that requires ordered operations, a direction now being embraced by SMR HDD archives. Regardless of the market reports for any measurement period, SSD, HDD and tape will all be around for a long time. The world depends on all of these technologies to continue delivering the performance, availability and durability of data, the global resource. As we approach another end of year, what’s old is new: tape is experiencing another year of growth, while the cost to store data on tape shrinks to 1/100th of a penny per GB-month.