Author: Thomas Stober - TSTOBER@de.ibm.com
This article gives an overview of IBM Fusion, its underlying storage infrastructure, and its deployment options.
Storage for multi-architecture hybrid cloud topologies
Data and AI solutions involve massive amounts of unstructured data, primarily in the form of files or objects, that IT must manage. The challenges include providing adequate amounts of storage on demand in hybrid cloud topologies as well as ensuring sophisticated data protection. Containerized workloads add another set of requirements for seamless integration:
- Scale and perform from small to extremely large deployments. Storage must be efficient, offering caching while avoiding unnecessary data copies.
- Share data across system boundaries, computing units, and networks in a way that makes the data consumable by applications.
- Ensure resilience, including procedures for disaster recovery.
- Manage local, distributed, and cloud storage in a consistent way.
In such solutions, one can’t simply plug hard drives into boxes. Well, yes, eventually somebody needs to do this somewhere. But that is not the point. The point is to create a fabric of shared logical storage units. Software-defined storage is all about virtualizing storage and making it available to the workload that fulfills dynamic business needs. This versatile fabric adds a level of abstraction that decouples the physical hardware, which can reside anywhere, from the consuming workload. The fabric can span virtual machines, hardware, networks, and even computer architectures. It creates the flexible storage foundation for true hybrid workloads.
The advantage of software-defined storage is that it provides storage that survives disasters, scales with the customer workload, and is simple to provision. Software-defined storage also enables sharing data across system boundaries and distributed applications, such as containerized Red Hat OpenShift Container Platform workloads.
Persistent Volume Claims and Persistent Volumes
In Kubernetes environments like Red Hat OpenShift, you can achieve the abstraction of physical storage using a simple approach:
- Developers define persistent volume claims (PVCs) to request the storage resources their applications need. A PVC is an abstraction of the storage available to applications and declares how much and what kind of storage the consumer needs. PVCs are a common Kubernetes concept that container-native storage takes advantage of.
- The storage administrator provisions storage for the cluster as persistent volumes (PVs). PV resources can be shared across the entire Red Hat OpenShift cluster and claimed from any project.
- Under the hood, claims are dynamically bound to the actual available storage. This maps the physical storage to the developers' needs. Each PV represents a piece of existing storage that was either statically provisioned by the cluster administrator or dynamically provisioned. Note: a PV can be bound to only one PVC at a time.
This simple but fundamental concept of container-native storage is depicted in Figure 1. It is the foundation of a software-defined storage infrastructure as offered by IBM® products like IBM Fusion.

Figure 1: Using persistent volumes in Kubernetes
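To make this concrete, here is a minimal sketch of a PVC that a developer might create. The claim name and the storage class name are hypothetical placeholders, not defaults of any particular product.

```yaml
# Hypothetical PVC requesting 10 GiB of shared storage.
# "my-storage-class" is a placeholder for a class the administrator offers.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteMany               # several pods may read and write the volume
  resources:
    requests:
      storage: 10Gi               # how much storage is needed
  storageClassName: my-storage-class   # what kind of storage is needed
```

With dynamic provisioning, creating this PVC is sufficient: the storage class triggers the creation of a matching PV, which is then bound to the claim and mounted into the consuming pods.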
IBM Fusion
Let us now have a closer look at IBM’s flagship offering for container-native software-defined storage: IBM Fusion. IBM Fusion is based on Red Hat® OpenShift® and includes two options for storage infrastructure, which can be combined as needed:
- The first option is IBM Storage Scale, which is based on GPFS file storage.
- The other option is IBM Fusion Data Foundation, which is also known as Red Hat OpenShift Data Foundation and is based on open-source Ceph® storage.
IBM Fusion focuses on consumability; for example, it features common operational procedures for disaster recovery and deployment, regardless of which storage infrastructure you have chosen.
In addition, IBM Fusion adds services of its own, for example a backup/restore service that protects the data stored in the underlying infrastructure.
It’s worth mentioning that IBM Fusion as an IBM product includes licenses to use both infrastructure options and allows you to deploy both IBM Storage Scale and IBM Fusion Data Foundation in the same datacenter. There is support for IBM Z and LinuxONE as well as for IBM zCX Foundation for Red Hat OpenShift, in case you want to run this software directly on IBM z/OS®.
Let us look closer at the software stack of IBM Fusion:
First, as you can see in the red box of Figure 2, all of Fusion runs in a Red Hat OpenShift cluster. Fusion is deployed as a set of Red Hat OpenShift operators.

Figure 2: IBM Fusion
At the core of Fusion are the two options for the actual storage infrastructure: IBM Storage Scale and IBM Fusion Data Foundation.
There are some notable differences between these options:
The actual IBM Storage Scale storage is always deployed outside of Fusion as externally shared storage. Inside Fusion, there is a client component called IBM Storage Scale Container Native Storage Access (CNSA). CNSA links the external Storage Scale storage to the Red Hat OpenShift cluster and allows you to use Storage Scale as container-native storage.
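As a hedged illustration of what this looks like to a consumer, the sketch below shows a storage class backed by an external Storage Scale file system. The provisioner name follows the IBM Storage Scale CSI driver; the class name and the file system name gpfs0 are hypothetical examples.

```yaml
# Sketch: storage class backed by an external IBM Storage Scale file system.
# "gpfs0" is a hypothetical file system name on the external Scale cluster.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: scale-shared
provisioner: spectrumscale.csi.ibm.com   # IBM Storage Scale CSI driver
parameters:
  volBackendFs: gpfs0                    # file system served externally
reclaimPolicy: Delete
```

PVCs that reference this class are served by the external Scale cluster, while applications see ordinary container-native volumes.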
IBM Fusion Data Foundation can be deployed either internally, inside the Red Hat OpenShift cluster of IBM Fusion, or it can use external (shared) IBM Storage Ceph storage. Please note that this external Ceph storage deployment is only available on x86 systems.
While Storage Scale offers file storage only, Fusion Data Foundation provides file, block, and object storage to the consuming applications.
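For illustration, the following sketch requests block and file storage from Fusion Data Foundation. It assumes the ocs-storagecluster-* storage class names that a default deployment typically creates; verify the actual class names in your cluster.

```yaml
# Sketch: block storage via Ceph RBD (assumed default class name).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-block
spec:
  accessModes: [ReadWriteOnce]    # block volumes attach to a single node
  volumeMode: Block
  resources:
    requests:
      storage: 50Gi
  storageClassName: ocs-storagecluster-ceph-rbd
---
# Sketch: shared file storage via CephFS (assumed default class name).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-files
spec:
  accessModes: [ReadWriteMany]    # shared read/write access across pods
  resources:
    requests:
      storage: 100Gi
  storageClassName: ocs-storagecluster-cephfs
```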
One of the key value-adds of IBM Fusion is a growing set of additional services on top of the storage infrastructure.
Most notable is a powerful backup/restore service, which safeguards data in both storage infrastructure options in the same user-friendly way. The coordination of backups is provided by a centralized “hub” server, which makes multi-cluster orchestration of data protection much easier than before. In each cluster, a lightweight “spoke” service acts as a local agent and performs the actual backup/restore of persistent volumes and application metadata. A powerful aspect of the backup/restore service is its ability to scope backup archives to individual application namespaces. Interoperability is a tremendous achievement of this concept, as backups can be exchanged across different cluster deployments and cluster versions as well as across different cloud storage providers. This makes it a true hybrid cloud solution.
Both the backup hub and the spoke agent are fully supported on IBM Z and LinuxONE.
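To give a feel for how namespace-scoped data protection can be declared, here is a purely illustrative sketch of a backup policy resource. The API group, kind, and every field name are hypothetical placeholders; the actual custom resources are described in the Fusion backup/restore documentation.

```yaml
# Purely illustrative sketch; all names are hypothetical placeholders.
apiVersion: example.fusion.ibm.com/v1     # placeholder API group
kind: BackupPolicy
metadata:
  name: nightly-app-backup
spec:
  schedule: "0 2 * * *"                   # run every night at 02:00
  retention: 30d                          # keep backups for 30 days
  target:
    namespace: my-app                     # scope the backup to one namespace
    objectStorageLocation: backup-bucket  # hypothetical backup target
```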
In summary, IBM Fusion exposes container-native software-defined storage to applications by leveraging proven IBM Storage technologies and enriching them with additional capabilities and a user-friendly interface. Administrators acting as data conductors and developers acting as data consumers who are familiar with Kubernetes will easily find their way around this IBM offering.
Using IBM Storage Scale infrastructure
IBM Storage Scale excels when dealing with large amounts of globally distributed unstructured data. These are the perfect scenarios for AI, big data analytics, or any other high-performance workload. IBM Storage Scale just scales amazingly well!
As a global data platform, Scale makes data available wherever it is needed, without redundant copies, which would be hard to manage. Convenient features allow you to define multiple tiers of data access. Disaster recovery and high availability are well thought through. Another aspect to highlight is the large number of supported protocols, which applications can use when dealing with storage.
Here is an attempt to describe Storage Scale in one sentence:
“High performance parallel data access with enterprise data services connecting edge to core to public cloud in a single federated cluster”
Using IBM Fusion Data Foundation infrastructure
The key values of IBM Fusion Data Foundation, also known as Red Hat OpenShift Data Foundation, are ease of use and simplicity, as it is based on standard Kubernetes skills. Fusion Data Foundation works the same on any platform, which makes it perfect for hybrid cloud solutions. It also supports all three relevant storage types: object, file, and block storage. As Fusion Data Foundation is part of the Red Hat OpenShift family, it is nicely integrated into the Red Hat OpenShift ecosystem, from both an administration and a development point of view.
Let’s try a one-sentence summary of Fusion Data Foundation’s value proposition:
“Federate underlying storage hardware to an abstracted repository, which is easy to provision and consume.”
Running IBM Fusion on IBM Z and LinuxONE hardware
One more consideration when choosing your software-defined storage infrastructure is the choice of the hardware platform.
While IBM Fusion is available on several hardware platforms with more or less the same set of features, there are clear benefits when running on IBM Z and LinuxONE:
- The resilience of IBM Z and LinuxONE is second to none, featuring 99.99999% platform uptime thanks to hardware that is designed for reliability.
- Compliance, sustainability, and multi-tenancy through EAL5+ certified virtualization add to the value of the platforms.
- By consolidating multiple applications on a single versatile server, administration and orchestration becomes much simpler and more efficient.
- For containerized workloads using container-native storage, the IBM Z and LinuxONE platforms excel with impressive vertical as well as horizontal scalability. You can easily scale up to millions of containers and thousands of Linux guests in one physical server to meet even unexpected peak load.
- In combination with Red Hat OpenShift, the administration of your workload is based on standard Kubernetes practices and hides hardware specifics. You can take advantage of all the benefits of these splendid platforms, while IBM Fusion can be used in a platform-agnostic way with common industry skills.
Building hybrid topologies
As mentioned before, software-defined storage combines multiple storage units into a single coherent logical storage infrastructure, even across platforms and architectures.
Let us look at usage patterns for Fusion Data Foundation (FDF) on IBM Z and LinuxONE first (see Figure 3).
One option is to leverage the IBM Fusion HCI appliance (HCI = hyperconverged infrastructure) as a container-native storage provider and consume that storage, for example, from Red Hat OpenShift workloads.
Another option is to deploy an IBM Storage Ceph cluster as a shared storage provider (on x86 environments) and consume that storage from Red Hat OpenShift with IBM Fusion Data Foundation using the so-called “external mode”.
And finally, IBM Fusion Data Foundation allows you to consume external object storage, which can be offered, for example, by a cloud provider like IBM Cloud® Object Storage or AWS. The support for the S3 object storage interface is provided by the NooBaa Multicloud Object Gateway, which is included in IBM Fusion.

Figure 3: Distributed multi-architecture topologies based on IBM Fusion Data Foundation
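As a sketch of the object storage option, an application can claim an S3 bucket through the NooBaa gateway with an ObjectBucketClaim. The storage class name below assumes a default openshift-storage deployment; adjust it to your environment.

```yaml
# Sketch: claiming an S3 bucket via the NooBaa Multicloud Object Gateway.
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: app-bucket
spec:
  generateBucketName: app-bucket                  # a unique suffix is appended
  storageClassName: openshift-storage.noobaa.io   # assumed default class
```

Once the claim is fulfilled, the gateway publishes the S3 endpoint and credentials in a ConfigMap and a Secret with the same name as the claim, which the application can reference.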
Hybrid topologies that combine multiple storage units into a single coherent logical storage infrastructure can be built on IBM Z and LinuxONE using IBM Storage Scale in a very similar fashion.
Again, one option is to leverage the IBM Fusion HCI appliance as a container-native storage provider and consume that storage, for example, from Red Hat OpenShift workloads.
Another option is to deploy an external IBM Storage Scale cluster as a storage provider and consume that storage in Red Hat OpenShift with the CNSA operator. Notably, the IBM Storage Scale cluster itself can also span multiple architectures.

Figure 4: Distributed multi-architecture topologies based on IBM Storage Scale
Deploying IBM Fusion on IBM Z and LinuxONE

Figure 5: Deployment patterns for IBM Fusion on IBM Z and LinuxONE
The cluster on the left side of Figure 5 includes an external IBM Storage Scale cluster combined with CNSA. Please note that the Storage Scale nodes run directly on Linux on IBM Z / LinuxONE, while CNSA runs on compute nodes inside the Red Hat OpenShift cluster.
The middle cluster includes three storage nodes of Fusion Data Foundation running inside a Red Hat OpenShift cluster.
Both options can run entirely inside your IBM Z and LinuxONE environment. You can choose freely whether to place the involved nodes on dedicated LPARs or on IBM z/VM® / KVM hypervisors. This gives you a great deal of flexibility.
The cluster on the right side of the figure combines the external storage appliance IBM Fusion HCI with a Red Hat OpenShift cluster running on IBM Z and LinuxONE.
More details can be found in the product documentation at ibm.com/products/storage-fusion and in the blog IBM Fusion Data Foundation on IBM Z and IBM LinuxONE.