Introduction
As data fabric initiatives evolve, a comprehensive approach to storing, securing, and managing file and object data becomes imperative for organizations that generate and rely on unstructured data as part of their business. On May 17th, IBM introduced the Elastic Storage System (ESS) 3500, which offers organizations the fastest and simplest way to deploy and benefit from a global data platform for unstructured data.
Two dynamics are driving the need for a global data platform for file and object data.
First, rapid advancements in application development, especially the emergence of machine learning and AI use cases, highlight the need for a unified and consistent approach to accessing data. As data fabric initiatives continue evolving, a global data platform for unstructured data becomes an essential part of the overall solution. You’ve heard the saying “No AI without IA”: no artificial intelligence without an information architecture. In a similar fashion, you cannot have a true data fabric solution without a global data platform for file and object data.
Second, as organizations consider data fabric strategies and the diverse IT infrastructure options available, from edge to core data center to public cloud, the need for a global data platform becomes more evident. ‘Islands’ of unstructured data emerge, highlighting the need for a single source of truth that facilitates secure data access while eliminating data redundancy and inconsistencies.
IBM’s global data platform solution, powered by IBM Spectrum Scale, offers a comprehensive framework of four essential and differentiated data services for unstructured file and object data:
- Data Access Services – Unified, shared file and object data access to any unstructured data storage system
- Data Caching Services – Getting the right data to the right application at the right time, without making unnecessary copies of data
- Data Management Services – Continuous scanning and cataloging of all unstructured data to provide visibility, control, and automation of an organization’s policies for governance, compliance, and retention of file and object data
- Data Security Services – Comprehensive tools and capabilities to identify and detect threats to an organization’s file and object data, with essential response and recovery capabilities when security breaches occur
Data Access Services
Access to the same data, at the same time, using whichever protocol the workload requires
Organizations have a variety of use cases, workloads, and applications to run. Therefore, a global data platform must “speak the language” of each application. Data access must be “multi-lingual,” meaning some applications will create and access data with one protocol, while others may require access to the same data with a different protocol.
For example, an Enterprise IT application may require NFS to access the data, whereas an AI/ML application may need high performance and require S3 or POSIX. Traditionally, those workloads would be unable to share the same data. The global data platform eliminates this barrier by providing multiprotocol data access: in this case, data can be written with NFS and accessed with S3.
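To make the multiprotocol flow concrete, here is a minimal Python sketch that writes a file through an NFS mount of the shared file system and reads the same data back through an S3 interface. The mount path, endpoint URL, bucket name, and credentials are placeholders for illustration only; in practice they depend on how the protocol services are configured.

```python
import boto3

# Write a result file through the NFS mount of the shared file system.
# The mount point and path are hypothetical placeholders.
nfs_path = "/mnt/shared-fs/experiments/run42/results.csv"
with open(nfs_path, "w") as f:
    f.write("sample,score\n1,0.97\n")

# Read the same data back through an S3 interface.
# Endpoint, bucket, and credentials are assumptions for illustration only.
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.example.internal",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)
response = s3.get_object(Bucket="experiments", Key="run42/results.csv")
print(response["Body"].read().decode())
```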
Data Caching Services
A global data platform must provide data access independent of where the data resides, without creating copies of the data. For example, when cloud-native applications require high-performance access to data stored in an S3 bucket, a global data platform transparently executes “vertical data caching” and automatically fetches the data. Likewise, for cloud-bursting or data collaboration use cases, the global data platform transparently executes “horizontal data caching,” making data available to public cloud infrastructure or other locations. Again, the platform acts as a local cache without copying the data (see the sketch after the list below).
- Data Virtualization Services: Integrate legacy file and object data stores into a single file system to break down legacy data silos, creating a High-Performance Tier for analytics
- Data Collaboration Services: Share data transparently across core, edge, and cloud locations, making the same data available to distributed teams and applications without creating copies
- Hybrid Cloud Bursting Services: Enable bursting to public cloud or remote sites with access to all data at the home site, fetching data transparently, either pre-fetched or on demand
- Data Resiliency Services: Enable a disaster recovery solution for business continuity, including an active-passive site relationship with failover and automatic data reconciliation on failback
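The read-through behavior behind on-demand caching can be pictured with a short, hypothetical Python sketch. This is not the platform’s implementation, which handles caching transparently inside the file system; it simply illustrates the pattern of fetching data from the home site only on a cache miss. The directory, bucket, and endpoint names are assumptions.

```python
import os
import boto3

# Hypothetical local cache directory and remote "home" bucket used to
# illustrate read-through (on-demand) caching.
CACHE_DIR = "/cache/fileset"
HOME_BUCKET = "home-site-data"

s3 = boto3.client("s3", endpoint_url="https://s3.home-site.example")

def read_through_cache(key: str) -> bytes:
    """Return file contents, fetching from the home site only on a cache miss."""
    local_path = os.path.join(CACHE_DIR, key)
    if not os.path.exists(local_path):
        # Cache miss: fetch the object once and keep it locally, rather than
        # permanently replicating the whole data set to the remote site.
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        s3.download_file(HOME_BUCKET, key, local_path)
    with open(local_path, "rb") as f:
        return f.read()

# Subsequent reads of the same key are served from the local cache.
data = read_through_cache("training/images/batch-001.tar")
```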
Data Management Services
A global data platform must provide visibility, control, and automation, facilitating data orchestration and policy-driven data lifecycle management. With the diverse IT infrastructure options mentioned earlier, finding the right data for business analytics, machine learning, and AI use cases can become extremely challenging.
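As a simplified illustration of what scanning, cataloging, and policy-driven lifecycle management involve (the platform performs this at scale with its built-in policy engine), the following hypothetical Python sketch walks a directory tree, records basic metadata, and flags cold files that a retention policy would migrate to a capacity tier. The scan root, threshold, and action names are placeholders.

```python
import csv
import os
import time

# Hypothetical scan root and retention threshold for illustration only.
SCAN_ROOT = "/mnt/shared-fs/projects"
COLD_AFTER_DAYS = 90
now = time.time()

# Catalog basic metadata and the lifecycle action a simple policy would take.
with open("catalog.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["path", "size_bytes", "days_since_access", "policy_action"])
    for dirpath, _, filenames in os.walk(SCAN_ROOT):
        for name in filenames:
            path = os.path.join(dirpath, name)
            stat = os.stat(path)
            idle_days = (now - stat.st_atime) / 86400
            action = "migrate-to-capacity-tier" if idle_days > COLD_AFTER_DAYS else "keep"
            writer.writerow([path, stat.st_size, round(idle_days, 1), action])
```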