Spectrum Scale Data Access Services (DAS): New S3 access for AI and Analytics workloads

By Ulf Troppens posted Fri May 27, 2022 10:27 AM

  
Today IBM released IBM Spectrum Scale Data Access Services (DAS) S3 [1], which provides remote access over the S3 protocol to data stored in Spectrum Scale filesystems [3]. Spectrum Scale DAS extends Spectrum Scale container native [4] and integrates seamlessly with Spectrum Scale’s existing configuration and management mechanisms.

The initial release of Spectrum Scale DAS is optimized for AI and analytics workloads. S3 objects and S3 buckets are mapped 1:1 to files and directories in Spectrum Scale filesystems and vice versa. A Spectrum Scale filesystem provides the storage capacity for the object data. For the initial release, all data must be created, processed, and deleted using the S3 object access protocol.
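The 1:1 mapping between S3 names and filesystem paths can be pictured with a small Python sketch. It is an illustration only, not the Spectrum Scale DAS implementation: the mount point `/mnt/fs1`, the function names, and the treatment of `/` in object keys as nested directories are all assumptions made for this example.

```python
from pathlib import PurePosixPath

# Hypothetical mount point of the Spectrum Scale filesystem (assumption);
# the actual on-disk layout used by Spectrum Scale DAS may differ.
FS_ROOT = PurePosixPath("/mnt/fs1")


def object_to_path(bucket: str, key: str,
                   root: PurePosixPath = FS_ROOT) -> PurePosixPath:
    """Map an S3 (bucket, key) pair to a path: bucket -> directory, object -> file."""
    if not bucket or "/" in bucket:
        raise ValueError("bucket name must be a single, non-empty path component")
    parts = key.split("/")
    # Reject keys that cannot be represented as a clean relative path.
    if any(p in ("", ".", "..") for p in parts):
        raise ValueError("key contains an empty or relative path component")
    return root.joinpath(bucket, *parts)


def path_to_object(path: PurePosixPath,
                   root: PurePosixPath = FS_ROOT) -> tuple[str, str]:
    """Inverse mapping: file path under the root -> (bucket, key)."""
    rel = path.relative_to(root)  # raises ValueError if path is outside root
    if len(rel.parts) < 2:
        raise ValueError("path must contain a bucket directory and a file")
    bucket, *key_parts = rel.parts
    return bucket, "/".join(key_parts)


if __name__ == "__main__":
    p = object_to_path("training-data", "images/cat.jpg")
    print(p)
    print(path_to_object(p))
```

Because the mapping is invertible, every object name corresponds to exactly one file and every file to exactly one object name, which is what "and vice versa" means in the paragraph above.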

The performance of Spectrum Scale DAS is highly dependent on your underlying infrastructure and your workload. IBM published the following benchmark results for Spectrum Scale DAS [2]:
  • COSBench using objects with a size of 1 GB running against a three-node Spectrum Scale Data Access Services cluster and using IBM Elastic Storage System 3200 as back-end storage:
    • More than 60 GB/s aggregated throughput for read workloads,
    • More than 20 GB/s aggregated throughput for write workloads.
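For rough sizing, the aggregate figures above can be broken down with some back-of-the-envelope arithmetic. The sketch below takes the published numbers as lower bounds and assumes the load is spread evenly across the three nodes, which the benchmark summary does not state; the function names and the 6000 GB dataset are made up for illustration.

```python
# Published lower bounds from the benchmark summary above.
NODES = 3
READ_GBPS = 60.0       # aggregate read throughput, GB/s
WRITE_GBPS = 20.0      # aggregate write throughput, GB/s
OBJECT_SIZE_GB = 1.0   # COSBench object size


def per_node(aggregate_gbps: float, nodes: int = NODES) -> float:
    """Throughput per node, assuming an even split (an assumption)."""
    return aggregate_gbps / nodes


def objects_per_second(aggregate_gbps: float,
                       object_size_gb: float = OBJECT_SIZE_GB) -> float:
    """Objects transferred per second at the given aggregate throughput."""
    return aggregate_gbps / object_size_gb


def transfer_time_s(dataset_gb: float, aggregate_gbps: float) -> float:
    """Seconds needed to move a dataset at the given aggregate throughput."""
    return dataset_gb / aggregate_gbps


if __name__ == "__main__":
    print(f"read per node:  {per_node(READ_GBPS):.1f} GB/s")
    print(f"write per node: {per_node(WRITE_GBPS):.1f} GB/s")
    print(f"read objects/s: {objects_per_second(READ_GBPS):.0f}")
    # Hypothetical 6000 GB dataset, read at the published aggregate rate.
    print(f"read 6000 GB:   {transfer_time_s(6000, READ_GBPS):.0f} s")
```

Under these assumptions each node serves at least 20 GB/s of reads and roughly 6.7 GB/s of writes; actual results depend on your infrastructure and workload, as noted above.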

To achieve optimal performance, Spectrum Scale DAS requires a dedicated compact OpenShift cluster, IBM Elastic Storage Systems (ESS) [6] and high-speed networks. S3 applications run on collocated, separate servers using any operating system or any Kubernetes platform.

Spectrum Scale DAS Example Deployment

Compact OpenShift clusters are three-node OpenShift clusters, where each OpenShift node acts as a combined OpenShift master and OpenShift worker node [7]. Spectrum Scale DAS requires the OpenShift cluster to be configured with Spectrum Scale container native [4] and Spectrum Scale Container Storage Interface [5]. The Spectrum Scale container native cluster imports (remotely mounts) one Spectrum Scale filesystem, which is provided by a collocated Spectrum Scale storage cluster.

Spectrum Scale DAS stores S3 objects and S3 buckets as files and directories in the Spectrum Scale filesystem, which is owned by the Spectrum Scale storage cluster. In this way, Spectrum Scale DAS inherits Spectrum Scale’s built-in data management capabilities, such as integration of storage media with varying performance and capacity into the same filesystem (e.g., NVMe, SSD and NL-SAS), policy-based data placement and data movement, and backup and restore of object data by using the Spectrum Scale mmbackup function.

For more information on Spectrum Scale DAS, see the docs [1].


References

[1] IBM Spectrum Scale Data Access Services (DAS) 5.1.3

[2] Performance results for Spectrum Scale DAS 5.1.3

[3] IBM Spectrum Scale

[4] IBM Spectrum Scale Container Native

[5] IBM Spectrum Scale Container Storage Interface (CSI) Driver

[6] IBM Elastic Storage System (ESS)

[7] Red Hat OpenShift: Delivering a Three-node Architecture for Edge Deployments