
IBM ESS 3500 - The Simplest and Fastest Way to Deploy a Global Data Platform for Unstructured Data

By Matthew Geiser posted Mon May 30, 2022 01:39 AM

  

Introduction

As data fabric initiatives evolve, the need for a comprehensive approach for storing, securing and managing file and object data becomes an imperative for organizations that generate and rely on unstructured data as part of their business.  On May 17th, IBM introduced the Elastic Storage System (ESS) 3500, which offers organizations the fastest and simplest way to deploy and benefit from a global data platform for unstructured data.

Two dynamics are driving the need for a global data platform for file and object data.

First, rapid advancements in application development, especially the emergence of machine learning and AI use cases, highlight the need for a unified and consistent approach to accessing data.  As data fabric initiatives continue evolving, a global data platform for unstructured data becomes an essential part of the overall solution.  You’ve heard the saying “No AI without IA”: no artificial intelligence without an information architecture.  In a similar fashion, you cannot have a true data fabric solution without a global data platform for file and object data.

Technology dynamics creating the need for a global data platform for unstructured data

Second, as organizations consider data fabric strategies and the diverse IT infrastructure options available, from edge to core data center to public cloud, the need for a global data platform becomes more evident.  ‘Islands’ of unstructured data emerge, highlighting the need for a single source of truth that facilitates secure data access while eliminating data redundancy and inconsistencies.

IBM’s global data platform solution, powered by Spectrum Scale, offers a framework of four essential and differentiated data services: Data Access Services, Data Caching Services, Data Management Services and Data Security Services.
Data services framework of a global data platform for unstructured data


IBM's global data platform offers a comprehensive data services framework for unstructured file and object data.

  • Data Access Services – Unified, shared file and object data access to any unstructured data storage system
  • Data Caching Services – Getting the right data to the right application at the right time, without making unnecessary copies of data
  • Data Management Services – Continuous scanning and cataloging of all unstructured data to provide visibility, control, and automation of an organization’s policies regarding governance, compliance and retention of file and object data
  • Data Security Services – Comprehensive tools and capabilities to identify and detect threats and protect an organization’s file and object data, with essential response and recovery capabilities when security breaches occur
A global data platform for unstructured data should offer essential data services

Data Access Services

Access to the same data, at the same time, using whichever protocol the workload requires

Organizations have a variety of use cases, workloads, and applications to run.  Therefore, a global data platform must “speak the language” of each application.  Data access must be “multi-lingual”, meaning that some applications will create and access data with one protocol while others require access to the same data with a different protocol.

For example, an enterprise IT application may require NFS to access the data, whereas an AI/ML application may need high performance and require S3 or POSIX.  Traditionally, those workloads would be unable to speak to each other.  The global data platform removes this barrier by providing multiprotocol data access.  In this case, data can be written over NFS and read over S3.
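To make the "written with one protocol, read with another" idea concrete, here is a minimal conceptual sketch in Python. It is not Spectrum Scale code and uses no IBM API: the `SharedNamespace` class, its method names, and the sample path are hypothetical, and a temporary local directory stands in for the single copy of the data. The only point it illustrates is that a file-style view (path) and an object-style view (bucket plus key) can resolve to the same bytes.

```python
import tempfile
from pathlib import Path


class SharedNamespace:
    """Hypothetical stand-in: one directory tree holding the single copy of the data."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def write_file(self, relative_path: str, data: bytes) -> None:
        # File-style view: the caller addresses data by path, as an NFS client would.
        target = self.root / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)

    def get_object(self, bucket: str, key: str) -> bytes:
        # Object-style view: the caller addresses the same data by bucket and key,
        # as an S3 client would. No second copy is made.
        return (self.root / bucket / key).read_bytes()


def demo() -> bytes:
    with tempfile.TemporaryDirectory() as tmp:
        ns = SharedNamespace(Path(tmp))
        ns.write_file("sensors/2022/readings.csv", b"id,value\n1,42\n")
        return ns.get_object("sensors", "2022/readings.csv")


result = demo()
```

Here the top-level directory `sensors` plays the role of the bucket and the rest of the path plays the role of the key; that mapping is an assumption made for the sketch, not a statement about how any product maps paths to objects.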

The global data platform also maintains a single source of truth for the data, so users do not have to worry about version control.  Spectrum Scale supports the following protocols:

  • File Access – POSIX, NFS (Linux/UNIX), SMB (Windows)
  • Object Access – S3, Hadoop/HDFS
  • Container Access – Container Storage Interface (CSI)
  • GPUDirect Access – GPUDirect Storage (GDS)

Data Caching Services

A global data platform must provide data access independent of where the data resides, without creating copies of the data.  For example, when cloud-native applications require high-performance access to data stored in an S3 bucket, a global data platform transparently executes “vertical data caching” and automatically fetches the data.  Likewise, for cloud-bursting or data collaboration use cases, the global data platform transparently executes “horizontal data caching”, making data available to public cloud infrastructure or other locations.  In both cases, the platform acts as a local cache rather than creating another copy of the data.


  • Data Virtualization Services: Integrate legacy file and object data stores into a single file system to break down legacy data silos, creating a high-performance tier for analytics

  • Data Collaboration Services: Integrate legacy file and object data stores into a single file system to break down legacy data silos, also creating a high-performance tier for analytics

  • Hybrid Cloud Bursting Services: Enable bursting to public cloud or remote sites with access to all data at the home site; data is fetched transparently, either pre-fetched or on demand

  • Data Resiliency Services: Enable a disaster recovery solution for business continuity, including an active-passive site relationship with failover and automatic data reconciliation on failback
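The pre-fetched versus on-demand behavior mentioned above follows the general read-through cache pattern. The sketch below shows that pattern in plain Python. It is an illustration only, not the platform's caching implementation: the `ReadThroughCache` class, the `home_site` dictionary, and the file names are all hypothetical, and the dictionary lookup stands in for a transfer from the home site.

```python
from typing import Callable, Dict, Iterable


class ReadThroughCache:
    """Hypothetical cache site: serve reads locally, fetch from home only on a miss."""

    def __init__(self, fetch_from_home: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_home
        self._local: Dict[str, bytes] = {}
        self.remote_fetches = 0  # counts trips to the home site

    def read(self, name: str) -> bytes:
        # On-demand path: the application just reads; a miss triggers a fetch.
        if name not in self._local:
            self._local[name] = self._fetch(name)
            self.remote_fetches += 1
        return self._local[name]

    def prefetch(self, names: Iterable[str]) -> None:
        # Pre-fetch path: warm the cache before the workload starts.
        for name in names:
            self.read(name)


# The "home site" is simulated with an in-memory dictionary.
home_site = {"a.dat": b"alpha", "b.dat": b"beta"}
cache = ReadThroughCache(home_site.__getitem__)

cache.prefetch(["a.dat"])
first = cache.read("a.dat")   # served locally, already pre-fetched
second = cache.read("b.dat")  # fetched on demand
third = cache.read("b.dat")   # served locally on the second read

print(cache.remote_fetches)  # 2
```

The application code calls `read` the same way in every case, which is the sense in which the fetch is transparent. The home site remains the single source of truth; the cached entries are a local working set, not a separately managed copy.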

Data Management Services

A global data platform must provide visibility, control and automation that facilitate data orchestration and policy-driven data life cycle management.  With the diverse IT infrastructure options mentioned earlier, finding the correct data for business analytics, machine learning and AI use cases can become extremely challenging.


IBM’s global data platform provides not just visibility into the single source of truth but also the ability to ensure data is automatically moved to the most effective storage tier based on an organization’s policies regarding cost, performance, data access and so on.

  • Data Orchestration Services
  • Data Lifecycle Services
  • Data Retention and Archiving Services
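As a rough illustration of policy-driven placement, the sketch below picks a storage tier for each file from a small rule list. It is a simplified model written for this post, not the Spectrum Scale policy language: the tier names (`flash`, `disk`, `archive`), the idle-day thresholds, and the sample files are all assumed values chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FileInfo:
    name: str
    days_since_access: int


@dataclass(frozen=True)
class TierRule:
    tier: str
    max_idle_days: int  # rule matches files idle for at most this many days


def choose_tier(info: FileInfo, rules: List[TierRule], default_tier: str) -> str:
    # Check the strictest (smallest) idle threshold first; fall back to the default tier.
    for rule in sorted(rules, key=lambda r: r.max_idle_days):
        if info.days_since_access <= rule.max_idle_days:
            return rule.tier
    return default_tier


# Hypothetical policy: hot data on flash, warm data on disk, everything else archived.
RULES = [TierRule("flash", 7), TierRule("disk", 90)]
FILES = [
    FileInfo("model.ckpt", 1),
    FileInfo("q1-report.pdf", 30),
    FileInfo("2019-logs.tar", 400),
]

placements = {f.name: choose_tier(f, RULES, default_tier="archive") for f in FILES}
print(placements)
# {'model.ckpt': 'flash', 'q1-report.pdf': 'disk', '2019-logs.tar': 'archive'}
```

A real policy would typically weigh more than last-access time (cost, performance targets, retention requirements), but the shape is the same: metadata in, placement decision out, applied automatically rather than by hand.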

Data Security Services

A global data platform must provide security and cyber resiliency for effective detection and prevention of cyberattacks, and near-instant recovery of critical data in the event of a successful attack.

  • Identify Services: Analyze the storage environment to identify any issues in your cyber resiliency posture
  • Protect Services: Protect your storage infrastructure with Safeguarded copies and a CyberVault that automatically scans for corruption and identifies ransomware attacks once they have started
  • Detect Services: Detect anomalies in the environment that may be a precursor to a ransomware attack
  • Respond Services: Take automated action upon threat detection to secure data from attack
  • Recover Services: Instant access and instant restore to quickly recover from a cyber attack

ESS 3500

The Elastic Storage System offers the simplest and fastest method for deploying a global data platform, and in May, IBM launched an exciting new model.  The ESS 3500 continues breaking barriers with industry-leading performance and scalability.

It leverages the power of Spectrum Scale and NVMe/flash technology to deliver the ultimate high-performance storage for AI, data analytics and high-performance computing use cases.

In a 2U form factor with 24 NVMe drives, ESS 3500 can provide up to 91 gigabytes per second (GB/s) of throughput with low latency to optimize all types of I/O patterns and data layouts.  Given the high cost of data science and GPU resources, ESS 3500 delivers data to applications faster and, for GPU-accelerated use cases, helps ensure maximum utilization of GPU resources.
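To put the throughput figure in perspective, a back-of-the-envelope calculation helps. The sketch below assumes the quoted peak of 91 GB/s is sustained for the whole transfer, which real workloads may not achieve, and uses a hypothetical 10 TB (10,000 GB, decimal units) training dataset; the function name is made up for this example.

```python
def seconds_to_stream(dataset_gb: float, throughput_gb_per_s: float = 91.0) -> float:
    """Best-case time to stream a dataset once at a sustained throughput."""
    if throughput_gb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return dataset_gb / throughput_gb_per_s


# Hypothetical 10 TB dataset at the quoted peak rate.
best_case = seconds_to_stream(10_000)
print(f"{best_case:.1f} s")  # 109.9 s
```

Under those assumptions, one full pass over 10 TB takes under two minutes, which is the kind of delivery rate that keeps expensive GPUs busy instead of waiting on I/O.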


Learn more at:
https://www.ibm.com/storage/artificial-intelligence
