File and Object Storage


Advanced Static Volume Provisioning with IBM Spectrum Scale on Red Hat OpenShift

By GERO SCHMIDT posted 25 days ago

  

Abstract

As a distributed parallel file system, IBM Spectrum Scale can provide a global namespace for all your data, supporting POSIX file access as well as access through various protocols like NFS, SMB, Object, and HDFS. This enables data ingest from a variety of data sources and provides data access to different analytics and big data platforms like High Performance Computing (via POSIX direct file access), Hadoop (via HDFS transparency) and OpenShift (via IBM Spectrum Scale CNSA and IBM Spectrum Scale Container Storage Interface Driver / CSI) without the need to duplicate or copy data from one storage silo to another. This reduces costs (no waste of storage capacity on duplicate data) and time to insights (no waiting for data to be copied).

In OpenShift/Kubernetes persistent storage for containerized applications is consumed through persistent volumes and persistent volume claims. A storage provider in OpenShift typically creates a new and empty persistent volume (PV) in response to a persistent volume claim (PVC) through dynamic provisioning using storage classes. But what about providing and sharing access to existing data in IBM Spectrum Scale and making the data available to containerized applications running in OpenShift - without the need to copy and duplicate the data?

In this article we will take a closer look at how containerized applications in OpenShift can consume persistent storage in IBM Spectrum Scale. Specifically, we will explore how we can leverage static provisioning to provide and share access to existing data in IBM Spectrum Scale across OpenShift user namespaces and discuss some of the necessary considerations that need to be taken into account (like proper mapping of PVs to PVCs using labels or claimRef, SELinux context, etc.).

Disclaimer: Please note that the following statements and examples in this article are not formal support statements by IBM. This article is an exploratory journey through the different ways of volume provisioning in OpenShift/Kubernetes with the IBM Spectrum Scale CSI Driver and how to use standard Kubernetes methodologies to create statically provisioned PVs and bind them to specific PVCs, with the goal of providing and sharing access to existing data hosted in IBM Spectrum Scale. Kubernetes and OpenShift are quickly evolving projects, so options and behaviors may change and the IBM Spectrum Scale CSI Driver might need to adapt accordingly. So always make sure to test the proposed options carefully and, where necessary, obtain a proper support statement from IBM for future directions. The article is based on behaviors observed in OpenShift 4.9.22 and IBM Spectrum Scale CSI Driver 2.4.0.

Table of contents

  • Basic concepts of volume provisioning in OpenShift/Kubernetes
  • Dynamic volume provisioning
  • The default storage class
  • Static volume provisioning
  • The IBM Spectrum Scale CSI Driver volumeHandle
  • Advanced static volume provisioning
  • Advanced static volume provisioning using labels
  • Preventing access by namespace
  • Enforcing read-only access
  • Advanced static volume provisioning using claimRef
  • Storage Resource Quota
  • Summary of volume provisioning use cases
  • SELinux and uid/gid context

Basic concepts of volume provisioning in OpenShift/Kubernetes

In OpenShift and Kubernetes the fundamental ways of provisioning persistent storage in form of persistent volumes (PVs) to your containerized applications are

  1. Dynamic provisioning of volumes with a storage class, and
  2. Static provisioning of volumes through the cluster administrator.

With dynamic provisioning the cluster admin only needs to create a storage class and the corresponding PV is automatically created on demand by the CSI driver and bound to the originating persistent volume claim (PVC) of the user and the user's OpenShift namespace (also referred to as project in OpenShift). The PVC can then be used in pods to provide persistent storage to the containerized applications. However, dynamic provisioning generally provides a fresh and empty volume for the user to start with.

Static provisioning, on the other hand, requires the cluster admin to manually create a pool of persistent volumes (PVs) that can be claimed by users through persistent volume claims (PVCs).  A PVC will be bound to a PV based on a best match with regard to requested storage size and access mode. Here, the cluster admin typically would create a pool of PVs that are backed by pre-created empty directories in IBM Spectrum Scale to ensure that each user gets an empty PV bound to the user's PVC request. 

However, static provisioning also allows the cluster admin to create PVs which are actually backed by non-empty directories in IBM Spectrum Scale with existing data. For example, one could think of directories that contain huge amounts of data such as Deep Learning (DL) training data which we actually want to share and make available to a larger group of users in OpenShift to train and run AI/DL models without the need to copy and duplicate the data. Here we can leverage static provisioning to make existing data in IBM Spectrum Scale available to specific or multiple users.

Dynamic volume provisioning

Dynamic volume provisioning was promoted to stable in the Kubernetes 1.6 release (see Dynamic Provisioning and Storage Classes in Kubernetes) and is the preferred way of providing persistent storage to users in OpenShift/Kubernetes today. Prior to the availability of dynamic provisioning a cluster admin had to manually pre-provision the storage and create the persistent volume (PV) objects for the users' persistent volume claims (PVCs). With dynamic provisioning the cluster admin only needs to create a storage class (or multiple storage classes) and a new empty volume is automatically provisioned and created on-demand for each user's PVC request. Storage classes use provisioners that are specific to the individual storage backend; in this article we focus on the IBM Spectrum Scale Container Storage Interface Driver or, in short, the IBM Spectrum Scale CSI Driver.

IBM Spectrum Scale CSI Driver v2.5.0 supports different storage classes for creating

  • lightweight volumes (backed by directories in IBM Spectrum Scale),
  • fileset-based volumes (backed by independent/dependent filesets in IBM Spectrum Scale),
  • consistency group volumes (backed by dependent filesets jointly embedded in an independent fileset for consistent snapshots of all contained volumes).

Please refer to the Storage class section in the IBM Spectrum Scale CSI Driver documentation for more details about these different storage classes. Here we will briefly look at the full deployment cycle for using a storage class for lightweight volumes.

The cluster admin only needs to create a storage class. An example for a storage class for lightweight volumes is given below:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ibm-spectrum-scale-light-sc
provisioner: spectrumscale.csi.ibm.com
parameters:
  volBackendFs: "fs1"
  volDirBasePath: "pvc-volumes"
reclaimPolicy: Delete

All PVs provisioned from this storage class will be located in individual directories under [mount point fs1]/pvc-volumes/ in IBM Spectrum Scale. The target directory pvc-volumes in the IBM Spectrum Scale file system fs1 must exist prior to creating the storage class.
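
If the target directory does not exist yet, it can be created up front on any node where the file system is mounted, for example (a minimal sketch; adjust ownership and mode bits to your environment):

# mkdir -p /mnt/fs1/pvc-volumes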

A user can now issue a persistent volume claim (PVC) against this storage class ibm-spectrum-scale-light-sc as shown below:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ibm-spectrum-scale-pvc
spec:
  storageClassName: ibm-spectrum-scale-light-sc
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi

The IBM Spectrum Scale CSI driver will automatically create a new directory, here pvc-01a53f89-8d14-4862-abe0-98fe6fe57dfc, in the IBM Spectrum Scale file system fs1 under the mount path /mnt/fs1/pvc-volumes/, create a new PV object backed by this directory, and bind this PV to the user's PVC request in the user's namespace (note that a PVC is always a namespaced object):
# oc get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
ibm-spectrum-scale-pvc Bound pvc-01a53f89-8d14-4862-abe0-98fe6fe57dfc 10Gi RWX ibm-spectrum-scale-light-sc 5s

# oc get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-01a53f89-8d14-4862-abe0-98fe6fe57dfc 10Gi RWX Delete Bound user-namespace/ibm-spectrum-scale-pvc ibm-spectrum-scale-light-sc 3s

# ls -al /mnt/fs1/pvc-volumes/
drwxrwx--x. 2 root root 4096 Mar 29 18:26 pvc-01a53f89-8d14-4862-abe0-98fe6fe57dfc

The user can mount and use this PVC in all pods in the user's namespace as shown below, simply by referring to the PVC name, here ibm-spectrum-scale-pvc:
apiVersion: v1
kind: Pod
metadata:
  name: ibm-spectrum-scale-test-pod
spec:
  containers:
  - name: ibm-spectrum-scale-test-pod
    image: registry.access.redhat.com/ubi8/ubi-minimal:latest
    command: [ "/bin/sh" ]
    args: [ "-c","while true; do echo $(hostname) $(date +%Y%m%d-%H:%M:%S) | tee -a /data/stream1.out ; sleep 5 ; done;" ]
    volumeMounts:
    - name: vol1
      mountPath: "/data"
  volumes:
  - name: vol1
    persistentVolumeClaim:
      claimName: ibm-spectrum-scale-pvc

In this example, the PVC/PV will be mounted under the local path /data in the pod's container. It is located in IBM Spectrum Scale at /mnt/fs1/pvc-volumes/pvc-01a53f89-8d14-4862-abe0-98fe6fe57dfc, with /mnt/fs1 being the mount point of the IBM Spectrum Scale file system fs1 on the OpenShift cluster nodes.
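
As a quick (hypothetical) verification step, assuming the pod and PVC names from the example above, the data written by the pod can be inspected both inside the container and directly in the file system on any node where fs1 is mounted:

# oc exec ibm-spectrum-scale-test-pod -- tail -n 2 /data/stream1.out
# tail -n 2 /mnt/fs1/pvc-volumes/pvc-01a53f89-8d14-4862-abe0-98fe6fe57dfc/stream1.out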

The default storage class
The cluster admin can define multiple storage classes if needed and mark one of the storage classes as default storage class. In this case any persistent volume claim (PVC) that does not explicitly request a storage class through the storageClassName (i.e. this line is omitted in the PVC manifest) will be provisioned by the default storage class.
An existing storage class can be marked as default storage class as follows:
# oc patch storageclass ibm-spectrum-scale-light-sc -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

# oc get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
ibm-spectrum-scale-light-sc (default) spectrumscale.csi.ibm.com Delete Immediate false 5d18h
ibm-spectrum-scale-sample spectrumscale.csi.ibm.com Delete Immediate false 23d

You can unmark the default storage class with:
# oc patch storageclass ibm-spectrum-scale-light-sc -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'

Note that the behavior of static or dynamic provisioning for persistent volume claims (PVCs) may change in the presence of a default storage class!

With no default storage class defined:

  • A PVC with no or an empty storageClassName uses static provisioning and is matched with available PVs from the pool of statically provisioned volumes.
  • A PVC with a provided storageClassName always uses dynamic provisioning with the specified storage class.

With a default storage class defined:

  • A PVC with no storageClassName uses dynamic provisioning with the default storage class.
  • A PVC with an empty ("") storageClassName uses static provisioning and is matched against available PVs from the pool of statically provisioned volumes.
  • A PVC with a provided storageClassName always uses dynamic provisioning with the specified storage class.

We will make use of the empty ("") storageClassName in a PVC in the following sections to make sure that we explicitly request a statically provisioned volume even in the presence of a default storage class.

Static volume provisioning

Static provisioning was the way of providing persistent storage in Kubernetes before dynamic provisioning became generally available. Today, you would typically use dynamic provisioning for the provisioning of new volumes to users. However, static provisioning offers ways of providing and sharing access to specific directories and thus existing data in IBM Spectrum Scale as we will explore in the coming sections. In order to better understand how static provisioning generally works we will briefly walk through the involved steps in the following paragraph.

With static provisioning the cluster admin would need to manually provision the storage (i.e. creating a directory in IBM Spectrum Scale) and create the persistent volume (PV) object as follows:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv01
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  csi:
    driver: spectrumscale.csi.ibm.com
    volumeHandle: "835838342966509310;099B6A7A:5EB99721;path=/mnt/fs1/data/pv01"

The volumeHandle given in this example is the original volumeHandle as used up to IBM Spectrum Scale CSI Driver v2.1.0. It is still compatible with IBM Spectrum Scale CSI Driver v2.5.0 for static volumes. The following parameters need to be provided in the volumeHandle:

  • 835838342966509310 is the clusterID of the local (primary) IBM Spectrum Scale CNSA cluster
  • 099B6A7A:5EB99721 is the file system ID of the IBM Spectrum Scale file system
  • /mnt/fs1/data/pv01 is the local path in CNSA (i.e. on the OpenShift nodes) to the backing directory in the specified IBM Spectrum Scale file system.

The cluster admin would typically create a pool of pre-provisioned PVs (e.g., pv01, pv02, ...) each backed by pre-provisioned empty directories in IBM Spectrum Scale to ensure that each user obtains an empty PV bound to the user's PVC request. The IBM Spectrum Scale CSI Driver provides a script to help with the generation of static PVs, see Generating static provisioning manifests.  
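
The IBM Spectrum Scale CSI Driver ships a helper script for this purpose (see the link above). Purely as an illustration, and assuming the cluster ID, file system UID, and base path used throughout this article, a simple shell loop could likewise create a small pool of backing directories and matching static PVs:

# Note: run where /mnt/fs1 is mounted and 'oc' is logged in to the cluster
for i in 01 02 03; do
  mkdir -p /mnt/fs1/data/pv${i}
  cat <<EOF | oc apply -f -
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv${i}
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  csi:
    driver: spectrumscale.csi.ibm.com
    volumeHandle: "835838342966509310;099B6A7A:5EB99721;path=/mnt/fs1/data/pv${i}"
EOF
done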

The PV that was created in the previous step can be claimed by a user through a regular persistent volume claim (PVC) such as

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ibm-spectrum-scale-pvc
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi

Typically the storageClassName: "" line could be omitted if no default storage class is present in the OpenShift cluster. We explicitly use an empty ("") storageClassName here in our manifest to ensure that we always skip dynamic provisioning, especially in the presence of a default storage class. A PVC with storageClassName: "" is always interpreted as requesting a PV without dynamic provisioning through a storage class (and the associated storage provider).

In general, the PVC will be bound to any available PV from the pool of pre-provisioned PVs based on a best match with regard to requested storage size and access mode. This means that a PV with a larger capacity (e.g. 100Gi instead of a requested 10Gi) and a broader access mode (i.e. RWX instead of the requested RWO) may be matched to a PVC that requests less capacity and a narrower access mode. Claims will remain unbound indefinitely if a matching volume does not exist. They will be bound once matching volumes become available, due to the declarative nature of Kubernetes/OpenShift. Refer to Binding for more details on the binding between a PV and a PVC.

The access modes in Kubernetes are:

  • ReadWriteOnce (RWO): the volume can be mounted as read-write by a single node. The ReadWriteOnce access mode can still allow multiple pods to access the volume when the pods are running on the same node.
  • ReadOnlyMany (ROX): the volume can be mounted as read-only by many nodes.
  • ReadWriteMany (RWX): the volume can be mounted as read-write by many nodes.
  • ReadWriteOncePod (RWOP): the volume can be mounted as read-write by a single pod. Use ReadWriteOncePod access mode if you want to ensure that only one pod across the whole cluster can access this PVC. This is only supported for CSI volumes and Kubernetes version 1.22+.

Note that although the access mode appears to be controlling access to the volume, it is actually used similarly to labels to match a PVC to a proper PV dependent on what the resource provider supports - there are currently no access rules enforced based on the selected accessModes. See Access Modes for more information.

After the PV is bound to a PVC in a user namespace it can be consumed by all pods in that namespace simply by referencing the PVC name in the volumes section of the pod manifest as shown in the example above. In our example, the user will obtain a new and empty PV from the pre-provisioned pool of available PVs that best matches the requested criteria and that is exclusively bound to the user's PVC request in the user's namespace. Note that a PVC is a namespaced object, which means it is namespace-bound, in contrast to a PV. Once a PVC is deleted, the associated static PV is released based on its reclaim policy (see Reclaiming). The default reclaim policy (persistentVolumeReclaimPolicy) is Retain, which means that the PV still exists but is in a released state (not in an available state), so it cannot be claimed and bound to another PVC request. The cluster admin needs to manually decide what to do with the released PV and reclaim it (e.g. delete and recreate the PV, delete the user data, etc.).
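
The reclaim policy can also be set explicitly in the spec section of the PV manifest with persistentVolumeReclaimPolicy: Retain. As a sketch (assuming a PV named pv01, and only after deciding what happens to the existing data), the cluster admin can make a released PV claimable again by clearing its claimRef entry:

# oc patch pv pv01 -p '{"spec":{"claimRef": null}}'
# oc get pv pv01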

The IBM Spectrum Scale CSI Driver volumeHandle

In our example and throughout this article we use the original volumeHandle as it was used up to IBM Spectrum Scale CSI Driver v2.1.0, see Creating a persistent volume (PV):

  # volumeHandle format up to IBM Spectrum Scale CSI Driver v2.1.0
  csi:
    driver: spectrumscale.csi.ibm.com
    volumeHandle: "835838342966509310;099B6A7A:5EB99721;path=/mnt/fs1/data/pv01"

It is still compatible with IBM Spectrum Scale CSI Driver v2.5.0 for static volumes although the volumeHandle itself has changed with release v2.5.0 as follows:

  # volumeHandle format as of IBM Spectrum Scale CSI Driver v2.5.0
  csi:
    driver: spectrumscale.csi.ibm.com
    volumeHandle: "0;0;835838342966509310;099B6A7A:5EB99721;;;/mnt/fs1/data/pv01"

The new volumeHandle for IBM Spectrum Scale CSI Driver v2.5.0 has introduced additional fields as described in Creating a persistent volume (PV) for v2.5.0 with 0;[Volume type];[Cluster ID];[Filesystem UUID];;[Fileset name];[Path to the directory or fileset linkpath]. For statically provisioned PVs the 1st field is "0" and the 5th field is always empty. Volume type is "0" for directory based volumes, "1" for dependent fileset based volumes and "2" for independent fileset based volumes. For directory based volumes, fileset name is always empty.

In any case, when defining a static PV with IBM Spectrum Scale CSI Driver we require specific information from IBM Spectrum Scale for the volumeHandle:

    volumeHandle: "835838342966509310;099B6A7A:5EB99721;path=/mnt/fs1/data/pv01"
                   [local cluster ID];[file system UID];path=[local path to backing directory]
(1) First we need the cluster ID of the local (primary) IBM Spectrum Scale CNSA cluster that is running on OpenShift. This can be retrieved from any of the IBM Spectrum Scale CNSA core pods (here we pick the pod worker1a in the ibm-spectrum-scale namespace) by executing the mmlscluster command as follows:
# oc exec worker1a  -n ibm-spectrum-scale -- mmlscluster -Y | grep clusterSummary | tail -1 | cut -d':' -f8
Defaulted container "gpfs" out of: gpfs, logs, mmbuildgpl (init), config (init)
835838342966509310
Alternatively, you can also retrieve this information from the IBM Spectrum Scale CSI custom resource (CR) csiscaleoperators.csi.ibm.com in the ibm-spectrum-scale-csi namespace as follows:
# oc get csiscaleoperators.csi.ibm.com ibm-spectrum-scale-csi -n ibm-spectrum-scale-csi -o yaml | grep -A5 " id:"
- id: "835838342966509310"
primary:
inodeLimit: ""
primaryFs: fs1
primaryFset: primary-fileset-fs1-835838342966509310
remoteCluster: "215057217487177715"
--
- id: "215057217487177715"
primary:
inodeLimit: ""
primaryFs: ""
primaryFset: ""
remoteCluster: ""

It will be the entry that has the primaryFs defined (i.e. non-empty), here, primaryFs: fs1, with fs1 being the local IBM Spectrum Scale file system name in the local IBM Spectrum Scale CNSA cluster on OpenShift. Note that IBM Spectrum Scale CNSA will use the local file system name (fs1) and local cluster ID (835838342966509310) also as part of the default name for the primary fileset that will be created and used by the IBM Spectrum Scale CSI Driver, here primary-fileset-fs1-835838342966509310. So you might also be able to tell these parameters from the primary fileset name even on the remote storage cluster:

# mmlsfileset fs1 -L
Filesets in file system 'fs1':
Name Id RootInode ParentId Created InodeSpace MaxInodes AllocInodes Comment
root 0 3 -- Mon May 11 20:19:22 2020 0 15490304 500736 root fileset
primary-fileset-fs1-835838342966509310 8 2621443 0 Fri Mar 11 17:59:18 2022 5 1048576 52224 Fileset created by IBM Container Storage Interface driver

(2) The second parameter that we need is the UID of the IBM Spectrum Scale file system where our target directory for the PV object will be located. We can obtain the UID from any of the IBM Spectrum Scale CNSA core pods (here we pick the pod worker1a in the ibm-spectrum-scale namespace) by executing the mmlsfs command as follows:

# oc exec worker1a -n ibm-spectrum-scale -- mmlsfs fs1 --uid
Defaulted container "gpfs" out of: gpfs, logs, mmbuildgpl (init), config (init)
flag value description
------------------- ------------------------ -----------------------------------
--uid 099B6A7A:5EB99721 File system UID

In this example, our local file system fs1 is remotely mounted from an IBM Spectrum Scale storage cluster, for example, an IBM Elastic Storage Server (ESS), as we can see with the mmremotefs command:
# oc exec worker1a -- mmremotefs show all
Defaulted container "gpfs" out of: gpfs, logs, mmbuildgpl (init), config (init)
Local Name Remote Name Cluster name Mount Point Mount Options Automount Drive Priority
fs1 ess3000_1M ess3000.bda.scale.ibm.com /mnt/fs1 rw yes - 0

We can identify the local path /mnt/fs1 where the file system is mounted on all participating OpenShift worker nodes, which we will need in the next step.

Note that if the local file system fs1 is a remote mount of a remote file system, here ess3000_1M, on a remote storage cluster, then both will have the same UID, i.e. running mmlsfs ess3000_1M --uid on the storage cluster will provide the same UID.

(3) The third parameter is the full local path to the destination or target directory in IBM Spectrum Scale that we want to use as backing directory for the PV. This path will be composed of two parts: the local mount point of the file system (/mnt/fs1) and the actual target directory (/data/pv01) within the file system where all the data of the PV will be located. So the complete local path will be /mnt/fs1/data/pv01.
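
Putting the three parameters together, a small shell sketch (reusing the commands shown above and assuming the pod worker1a, the namespace ibm-spectrum-scale, the file system fs1, and the target directory /mnt/fs1/data/pv01) could compose the volumeHandle string for the PV manifest:

CLUSTER_ID=$(oc exec worker1a -n ibm-spectrum-scale -- mmlscluster -Y | grep clusterSummary | tail -1 | cut -d':' -f8)
FS_UID=$(oc exec worker1a -n ibm-spectrum-scale -- mmlsfs fs1 --uid | awk '$1=="--uid" {print $2}')
echo "volumeHandle: \"${CLUSTER_ID};${FS_UID};path=/mnt/fs1/data/pv01\""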

For more details about the regular options for static provisioning with IBM Spectrum Scale CSI driver please refer to Static provisioning in the IBM Spectrum Scale Container Storage Interface Driver documentation.

Advanced static volume provisioning

Now that we understand the workflow for static provisioning with IBM Spectrum Scale CSI Driver, let's take a look at some use cases where static provisioning can help. While we would preferably use dynamic provisioning to provide new (and empty) volumes to users, we can make use of static provisioning if we want to provide or share access to existing data in IBM Spectrum Scale, especially, if we want to use IBM Spectrum Scale as a "Data Lake" for a variety of applications and across different data analytics platforms and architectures without the need to copy and duplicate all the data.

Static provisioning offers ways of providing and sharing access to specific directories and existing data in IBM Spectrum Scale. These could be specific directories that only specific users (i.e. OpenShift namespaces/projects) are allowed to access or shared directories where multiple users should be able to claim access to.

For specific directories accessed only by specific users (i.e. user namespaces) we need to make sure that these PVs can only be bound to specific PVCs in specific namespaces and not by any PVC in any namespace. For this use case we will describe how to work with static PVs using the claimRef option.

Shared directories on the other hand can be directories where huge amounts of data are stored, for example, data to train and run Machine Learning (ML) / Deep Learning (DL) models and where multiple users should be able to easily request access to by simply claiming a static PV from a pre-provisioned pool. Here we will make use of labels attached to the PVs that help to characterize the data (i.e. type: training, dept: ds, etc.) and allow the correct binding between the requested PVC and pre-provisioned PV.

Other use cases may include advanced features of IBM Spectrum Scale, like Active File Management (AFM), where data from a home location (AFM Home cluster) is made available on a remote edge location (AFM Cache cluster). Here, static provisioning can also be used to make data in AFM filesets available to containerized applications in OpenShift.

Advanced static volume provisioning using labels

First, we will look at a use case with shared directories with existing data that we want to make available to multiple user namespaces in OpenShift through statically provisioned PVs. Here, users shall be able to simply claim such a PV from a pre-provisioned pool and also be able to select between different kinds of PVs based on labels that characterize the data behind the PV. Kubernetes labels are key/value pairs that are attached to objects and can be used to select objects based on their labels through label selectors (see Labels and Selectors). 

Let's assume we want to share access to a specific data directory in IBM Spectrum Scale to a group of data scientists. This directory holds huge amounts of data to be processed either for training new models or for applying models and making proper classifications and predictions. Each data scientist works in a private namespace/project on OpenShift. The local path to the destination directory is /mnt/fs1/training-data. And we will characterize the data through two labels, type: training and dept: ds.

The cluster admin would need to manually prepare a pool of PVs (each with a unique PV name like train-pv01, train-pv02, train-pv03,...) with the following persistent volume (PV) manifest:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: train-pv01
  labels:
    type: training
    dept: ds
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  csi:
    driver: spectrumscale.csi.ibm.com
    volumeHandle: "835838342966509310;099B6A7A:5EB99721;path=/mnt/fs1/training-data"

The PVs use the labels type: training and dept: ds so a PVC from a user can claim a specific volume from all available PVs characterized specifically by these labels. The labels should characterize the data behind the PV and its backing directory.

Any user can now claim a PV from this pool and ensure to get a PV with the data that the user is actually interested in by using a selector with the respective labels in a corresponding persistent volume claim (PVC) as follows:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: training-data-pvc
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  selector:
    matchLabels:
      type: training
      dept: ds

We explicitly use an empty storageClassName: "" here in order to ensure that we always skip dynamic provisioning with a default storage class (in case a default storage class is present). Otherwise the PVC will get stuck in a pending state when the default storage class is invoked with a selector in the manifest:

Events:
Type Reason Message
---- ------ -------
Warning ProvisioningFailed failed to provision volume with StorageClass "ibm-spectrum-scale-light-sc": claim Selector is not supported

A PVC with a non-empty selector is not supposed to have a PV dynamically provisioned for it (see Persistent Volumes). The PVC above will be bound to any available PV from the pool of pre-provisioned PVs that is available and matches the specific labels given under matchLabels in the selector section. Although the requested storage capacity is ignored, access modes are still taken into account in the matching criteria.

Once the PV is bound to the user's PVC it can be used in a pod in the same way as provided in the pod example above (in the Dynamic volume provisioning section) simply by referencing the PVC name in the volumes section of the pod manifest. The PV is bound to the PVC in the user's namespace and no longer available to other users in other namespaces. Other users in other namespaces can issue an identical PVC request and will bind to another PV from the pool matching the requested labels (if any such PV is available). So if multiple users in different namespaces need access to the same data directory in IBM Spectrum Scale, the admin would need to create multiple identical PVs with the same labels and same volumeHandle but each with its own unique PV name.
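
As a sketch, assuming the cluster ID, file system UID, and path from the example above, such a set of identical PVs with unique names (all backed by the same /mnt/fs1/training-data directory, so that up to three namespaces can each bind one of them) could be created with a simple loop:

for i in 01 02 03; do
  cat <<EOF | oc apply -f -
apiVersion: v1
kind: PersistentVolume
metadata:
  name: train-pv${i}
  labels:
    type: training
    dept: ds
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  csi:
    driver: spectrumscale.csi.ibm.com
    volumeHandle: "835838342966509310;099B6A7A:5EB99721;path=/mnt/fs1/training-data"
EOF
done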

IMPORTANT: Please also refer to the section SELinux and uid/gid context below for additional important implementation considerations when sharing access to the same data in IBM Spectrum Scale across namespaces in OpenShift!

A variant of the above approach is that the cluster admin can also create static PVs associated with a fake (i.e. a non-existent) storageClassName of their own, such as "static" as used in the example below:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: train-pv01
  labels:
    type: training
    dept: ds
spec:
  storageClassName: static
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  csi:
    driver: spectrumscale.csi.ibm.com
    volumeHandle: "835838342966509310;099B6A7A:5EB99721;path=/mnt/fs1/training-data"

In addition to the labels type: training and dept: ds, the user's PVC request now also needs to reference this specific storageClassName with storageClassName: static rather than referencing the empty ("") storage class (and, of course, also skip dynamic provisioning with the default storage class):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: training-data-pvc
spec:
  storageClassName: static
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  selector:
    matchLabels:
      type: training
      dept: ds

This would improve overall volume management as all statically provisioned PVs can now - in addition to the labels - also be associated and grouped with their own storage class (although there is no provisioner and no real storage class associated with it) - similar to an annotation or another label. The OpenShift documentation provides a similar example for using such a storageClassName in a manually created PV in Persistent storage using hostPath.

Note that the chosen storageClassName for the static PVs must be different from any existing real storage class that is or will be present in the OpenShift cluster!

Preventing access by namespace

Associating the created PVs with their own storageClassName allows us to make use of Storage Resource Quota in order to limit access to these statically provisioned PVs based on their associated storageClassName. For example, by applying the following ResourceQuota manifest with an allowed maximum number of 0 (zero) persistent volume claims from the storage class static we would ensure that a user in the namespace dean cannot claim any PVs associated with the storageClassName static:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: dean
spec:
  hard:
    static.storageclass.storage.k8s.io/persistentvolumeclaims: 0

Claiming any PVs from the pool which are associated with the storage class static is now prevented for any PVC request in the namespace dean:

Error from server (Forbidden): error when creating "pvc01.yaml": persistentvolumeclaims "pvc01" is forbidden: exceeded quota: storage-quota, requested: static.storageclass.storage.k8s.io/persistentvolumeclaims=1, used: static.storageclass.storage.k8s.io/persistentvolumeclaims=0, limited: static.storageclass.storage.k8s.io/persistentvolumeclaims=0

For more information about the use of storage resource quota and the available options please refer to the respective Storage Resource Quota section below.

Enforcing read-only access

In other cases you may, for example, want to ensure that multiple users can access the shared data for read but shall not be able to write or change the data. In this case you might think of the admin creating the PVs with the additional readOnly CSI option as described in Kubernetes Volumes - Out-of-tree volume plugins - CSI:

apiVersion: v1
kind: PersistentVolume
spec:
  [...]
  csi:
    driver: spectrumscale.csi.ibm.com
    volumeHandle: "ClusterId;FSID;path=/gpfs/fs0/data"
    readOnly: true

Unfortunately, the CSI readOnly flag does not seem to be properly honored as of today by the Container Storage Interface (CSI) and therefore it is not yet implemented in IBM Spectrum Scale CSI Driver. See Kubernetes issues 61008 and 70505.

Of course, a user can always use the readOnly flag in the volumeMounts section of the pod manifest to ensure that a volume is mounted in read-only mode inside the container to prevent any changes to the data in that volume:

  spec:
    containers:
    - name: ibm-spectrum-scale-test-pod
      image: registry.access.redhat.com/ubi8/ubi-minimal:latest
      volumeMounts:
      - name: vol1
        mountPath: "/data"
        readOnly: true

However, this is not an option for an admin to generally protect the shared data from changes by regular users, as it would be up to the individual user to actually honor the request to mount the shared volume in read-only mode. It might be an option for selected automated workloads started by admins or other privileged users.

Other options to ensure read-only access on shared data in IBM Spectrum Scale would be to carefully work with file permissions (uid, gid, mode bits), ACLs or the immutability options in IBM Spectrum Scale as described in Immutability and appendOnly features or Creating immutable filesets and files.
For example, the storage admin could set the immutable flag in IBM Spectrum Scale on file level by running

# mmchattr -i yes my_immutable_file

# mmlsattr -L my_immutable_file
file name: my_immutable_file
metadata replication: 1 max 2
data replication: 1 max 2
immutable: yes
appendOnly: no
flags:
storage pool name: system
fileset name: root
snapshot name:
creation time: Tue Apr 12 18:09:10 2022
Misc attributes: ARCHIVE READONLY
Encrypted: no

# ls -alZ my_immutable_file
-rw-rw-rw-. 1 1000680000 root system_u:object_r:container_file_t:s0:c15,c26 10133 Apr 14 10:24 my_immutable_file

Setting the immutable flag in IBM Spectrum Scale would prevent any changes to the file when accessed from containerized applications in OpenShift:

$ oc rsh test-pod
sh-4.4$ id
uid=1000680000(1000680000) gid=0(root) groups=0(root),1000680000

sh-4.4$ ls -alZ /data/my_immutable_file
-rw-rw-rw-. 1 1000680000 root system_u:object_r:container_file_t:s0:c15,c26 10133 Apr 14 10:24 /data/my_immutable_file

sh-4.4$ echo xxxxxx >> /data/my_immutable_file
sh: /data/my_immutable_file: Read-only file system

Depending on when the immutable flag was applied, i.e. before or after the PV was mounted in a container, the error message in the container on an attempt to write to an immutable file shows either "Read-only file system" or "Permission denied", respectively.

Advanced static volume provisioning using claimRef

In this section we look at specific directories in IBM Spectrum Scale that we want to make available only to specific users or, more precisely, to specific namespaces. So we need to make sure that these PVs can only be bound to the specific namespace and not be claimed by any other user in any other namespace.

Instead of using labels as before we will now make use of the claimRef option as described in Reserving a PersistentVolume. By using claimRef we can declare (and enforce) a bi-directional binding between the statically provisioned PV and a PVC based on the PVC name and its originating namespace. Therefore, we also do not need to make use of ResourceQuota to control which namespace can or cannot consume the static PVs. However, the user should still always reference an empty storage class (storageClassName: "") in the PVC. Although the volume binding with claimRef happens regardless of other volume matching criteria (including the specified storage class in the PVC), this ensures that the PVC will not accidentally bind to a freshly provisioned volume from the default storage class (should one be present in the OpenShift cluster) in case the static PV has not yet been provisioned by the admin or in case of a mismatch/typo in the PVC name.

Let's assume we want to provide access to a specific directory in IBM Spectrum Scale with confidential business data only to the financial department which runs their analytic applications in a specific namespace called finance in OpenShift. On request, the cluster admin would prepare a persistent volume (PV) with the following manifest:

# cat create-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: business-pv01
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  claimRef:
    name: business-data-pvc
    namespace: finance
  csi:
    driver: spectrumscale.csi.ibm.com
    volumeHandle: "835838342966509310;099B6A7A:5EB99721;path=/mnt/fs1/business-data"

A user in the finance namespace can now easily claim the pre-provisioned persistent volume (PV) through a persistent volume claim (PVC) by using the exact same PVC name business-data-pvc as specified in the claimRef section:

# cat create-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: business-data-pvc
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi

By using claimRef we can declare a 1:1 bidirectional binding between a PV and a PVC and make sure that this PV will only bind to a PVC request in the finance namespace with the PVC name business-data-pvc. This way we can safely provide access to the confidential directory /mnt/fs1/business-data in IBM Spectrum Scale to the selected namespace. The binding with claimRef happens regardless of other volume matching criteria. However, the OpenShift/Kubernetes control plane still checks that storage class, access modes, and requested storage size are valid.

The cluster admin can also issue the PV creation request (oc apply -f create-pv.yaml) immediately followed by the PVC request (oc apply -f create-pvc.yaml -n finance) and verify the proper binding (oc get pvc -n finance) instead of waiting for the user to issue the PVC request and complete the binding.

Only by using claimRef can we ensure that the created PV is bound to the corresponding persistent volume claim (PVC) and namespace and that it is not bound to any other pending PVC request from other users that meets the volume matching criteria.

In case the finance department in our example has multiple namespaces where access to the same data is needed, the cluster administrator could prepare another PV with a similar manifest but with a different PV name (business-pv02) and a different target namespace in the claimRef section. In this case we would share access to the same data across more than one namespace, so the cluster admin would have to create multiple PVs, i.e. one PV per target namespace, as each PV binds to a PVC from another namespace. Note that only one PV per namespace is needed as the PVC can be used in all pods within the same namespace.
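
A minimal sketch of such an additional PV, assuming a hypothetical second namespace called finance-reporting, would only differ in its PV name and the claimRef namespace:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: business-pv02
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  claimRef:
    name: business-data-pvc
    namespace: finance-reporting
  csi:
    driver: spectrumscale.csi.ibm.com
    volumeHandle: "835838342966509310;099B6A7A:5EB99721;path=/mnt/fs1/business-data"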

IMPORTANT: Please also refer to the section SELinux and uid/gid context below for additional important implementation considerations when sharing access to the same data in IBM Spectrum Scale across namespaces in OpenShift!

Storage Resource Quota

Storage resource quotas allow limiting the total sum of storage resources that can be consumed in a given namespace, including the number of persistent volume claims (PVCs). The consumption of these storage resources can even be limited selectively based on associated storage classes which, of course, primarily aims at dynamic provisioning. However, as we showed in the Preventing access by namespace paragraph above with static PVs using labels, this can be used to exclude selected user namespaces from being able to claim any statically provisioned PVs associated with a specific storageClassName (i.e. a non-existing "fake" storage class). Resource quota would not need to be applied with static PVs using claimRef, which already ensures a 1:1 binding to a specific namespace and persistent volume claim (PVC) only.

Here is a more general example of setting storage resource quota for the namespace dean:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: dean
spec:
  hard:
    requests.storage: 500Gi
    persistentvolumeclaims: 10
    static.storageclass.storage.k8s.io/persistentvolumeclaims: 0
    ibm-spectrum-scale-sc.storageclass.storage.k8s.io/requests.storage: 100Gi
    ibm-spectrum-scale-sc.storageclass.storage.k8s.io/persistentvolumeclaims: 5

The user in the namespace "dean" cannot claim any PVs from the static PVs associated with the storage class static, can only have a maximum of 10 persistent volume claims (PVCs), and can claim a maximum storage capacity of 500Gi in total. Furthermore, the user can only consume a maximum of 5 persistent volume claims (PVCs) and a maximum storage capacity of 100Gi from the dynamic storage class ibm-spectrum-scale-sc. For more details about storage resource quota please refer to Storage Resource Quota.
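
The quota can be applied and inspected with standard commands, for example (assuming the manifest above is saved as storage-quota.yaml):

# oc apply -f storage-quota.yaml
# oc describe resourcequota storage-quota -n dean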

Summary of volume provisioning use cases

Typical use cases for the discussed volume provisioning methodologies above may be summarized as follows:

  • Dynamic provisioning using storage classes is the default method for providing new volumes to users on demand. The persistent volume (PV) is automatically provisioned and bound to the persistent volume claim (PVC) of a user in a self-service fashion. The cluster admin only needs to create the storage class(es).
  • Basic static provisioning (without using labels or claimRef) is generally not the preferred way of providing new volumes to users in OpenShift as this can more conveniently be done with dynamic provisioning. It also lacks control of which pre-provisioned PV will be bound to which PVC request or namespace. It can be used, however, if an admin needs to use dedicated paths and directories in IBM Spectrum Scale for persistent volumes (PVs) in OpenShift. In this case the admin would need to manually provision new directories in IBM Spectrum Scale and create the related PVs - each backed by its own empty directory. Typically the admin would provide a pool of PVs backed by empty directories in IBM Spectrum Scale so that users can claim these PVs through persistent volume claims (PVCs) on demand. The binding of a PVC to a PV is based on a best match with regard to requested storage size and access mode. The admin has no control over which PVC will bind to which PV. A PV will be bound to any PVC from any namespace that meets the volume matching criteria. Therefore it is only an option for empty directories when providing new volumes to users. Using dynamic provisioning with a storage class for creating lightweight volumes might be an alternative to consider here.
  • Static provisioning with labels (storageClassName: "") can be used to provide shared access to existing data in IBM Spectrum Scale to multiple users on demand. The labels characterize the data behind the PVs so that a user can selectively request different static PVs representing access to different data in IBM Spectrum Scale through a regular persistent volume claim (PVC) with the proper selector. A PV is bound to the PVC based on its requested selector labels and access modes. The static PVs need to be created manually by the cluster admin in advance. Typically the admin will create a pool of pre-provisioned PVs so that multiple users can claim a PV on demand. If the data needs to be accessed in multiple namespaces then the admin would need to create at least one PV per target namespace. Within a given namespace a PVC (exclusively bound to a PV) can be used in multiple pods across nodes (RWX - ReadWriteMany access mode). In order to ensure that we always skip dynamic provisioning with the default storage class (should one be present) the user needs to refer to an empty storageClassName: "" in the PVC manifest.
  • Static provisioning with labels (storageClassName: "static") can be further enhanced by associating the statically provisioned PVs with a "fake" storageClassName in their manifests, e.g. storageClassName: static. Instead of referring to an empty storageClassName: "" in the persistent volume claim (PVC) the user can now simply refer to a storage class like "static" even for statically provisioned PVs. This improves overall volume management as all statically provisioned PVs can now be associated with their own storage class (although there is no provisioner and no real storage class related to it). A PV is bound to a PVC based on its label selector, storage class and access modes. A huge advantage is that it allows making use of ResourceQuota in order to control which namespace can actually claim statically provisioned PVs associated with a specific storageClassName. Note that the chosen storageClassName for the static PVs must be different from any existing real storage class that is or will be present in the OpenShift cluster!
  • Static provisioning with claimRef can be used to provide shared access to specific data in IBM Spectrum Scale to specific users or, more precisely, user namespaces in a controlled fashion. With claimRef we can ensure a 1:1 binding between a statically provisioned PV and a persistent volume claim (PVC) from a selected namespace. Only the PVC with the name and namespace as specified in the claimRef section of the PV manifest can bind to the PV. Here we do not need to make use of ResourceQuota to control which namespace can or cannot consume the static PVs. Only by using claimRef can we make sure that a specific statically provisioned PV is actually bound to a specific PVC of a specific user namespace. In all other static volume provisioning cases without claimRef any (even pending) persistent volume claim (PVC) from any user in any namespace that meets the volume matching criteria can bind to the PV once it is created by the admin. Although the volume binding with claimRef happens regardless of other volume matching criteria (including the specified storage class in the PVC) it is a good idea for the user to always reference an empty storage class (storageClassName: "") in the persistent volume claim (PVC) to ensure that the PVC will not accidentally bind to a freshly provisioned volume from the default storage class in case the static PV has not yet been provisioned by the admin or in case of a mismatch/typo in the PVC name. The static PV has to be created manually by the cluster admin in advance. If the data needs to be accessed in multiple namespaces then the admin would need to create at least one PV per target namespace. Within a given namespace a PVC (exclusively bound to a PV) can be used in multiple pods across nodes (RWX - ReadWriteMany).

In any case, when access to the same data and location in IBM Spectrum Scale is shared across user namespaces, the SELinux and uid/gid restrictions apply as discussed in the SELinux and uid/gid context section below.

SELinux and uid/gid context

A user's workload in OpenShift is running in the context of a pod and its containers which are bound to a namespace. Data access to IBM Spectrum Scale is provided through a persistent volume (PV) which binds to a persistent volume claim (PVC) that is bound to a user's namespace. OpenShift and Kubernetes make use of namespaces to isolate resources (see Namespaces) while a file system like IBM Spectrum Scale relies on file permissions and ACLs to control user access to data.

A common user in OpenShift is not typically associated with a managed user ID (uid) or group ID (gid) that can easily be correlated to the uid and gid in the IBM Spectrum Scale file system. Typically, an OpenShift user is assigned an arbitrary uid from a pre-defined range when running workloads and accessing data in the associated pods. Deployed applications in OpenShift may run under some general uid/gid settings which are defined in the container image or the securityContext of the pod or container (see Configure a Security Context for a Pod or Container with runAsUser, runAsGroup, fsGroup or fsGroupChangePolicy) depending on the privileges of the associated user or service account.

Furthermore, OpenShift users in different namespaces even run under different SELinux contexts. Therefore, in order to take full advantage of the advanced features that a parallel file system like IBM Spectrum Scale has to offer (e.g. simultaneous file access from multiple nodes / RWX access mode, global namespace for all the data across different platforms and protocols, Active File Management/AFM, data protection and disaster recovery, etc.), special considerations apply when managing file permissions and SELinux context for sharing data access across different user namespaces in OpenShift.

OpenShift applies strict security standards. Users interacting with Red Hat OpenShift Container Platform (OCP) authenticate through configurable identity providers (for example, HTPasswd or LDAP). Through role-based access control (RBAC) objects like rules, roles, and role bindings OpenShift determines whether a user is authorized to perform an action within a namespace (or project). For more information, see Understanding identity provider configuration.

In addition, Security Context Constraints (SCCs) define a set of conditions which a pod must comply with. Pods eventually run the user's workload, and SCCs control the permissions for these pods and determine the actions that the pods (and their collections of containers) can perform. SCCs are composed of settings and strategies that control the security features that a pod can access. For more information, see Managing Security Context Constraints.

When sharing data access in IBM Spectrum Scale with users in OpenShift the cluster admin has to ensure that

  1. the file permissions of the PV's backing directory in IBM Spectrum Scale with its user (uid), group (gid) and mode bits (rwxrwxrwx) settings - and potentially configured ACLs - grant all intended OpenShift users and applications proper access to the intended files and nested sub-directories,

  2. the same SELinux context is applied to all user pods in OpenShift accessing the same backing directory in IBM Spectrum Scale through multiple PVs.

Without further customization, a common user in OpenShift runs under the restricted SCC (Security Context Constraints) while a privileged user like a cluster admin runs under the privileged SCC:

# oc get scc restricted privileged
NAME PRIV CAPS SELINUX RUNASUSER FSGROUP SUPGROUP PRIORITY READONLYROOTFS VOLUMES
restricted false <no value> MustRunAs MustRunAsRange MustRunAs RunAsAny <no value> false ["configMap","downwardAPI","emptyDir","persistentVolumeClaim","projected","secret"]
privileged true ["*"] RunAsAny RunAsAny RunAsAny RunAsAny <no value> false ["*"] 

This means a common user in OpenShift under the restricted SCC (runAsUser: MustRunAsRange) is assigned an arbitrary uid, e.g. 1000680000, from a pre-defined range, with a gid 0 (root). This OpenShift assigned uid even takes precedence over the USER defined in the container image (i.e. Dockerfile). So the uid/gid in OpenShift will not typically match the uid/gid of the user in the IBM Spectrum Scale file system.
Of course, one could imagine applying a more granular set of SCCs down to individual users or groups or selected service accounts in order to enforce a specific uid, gid and SELinux label, but this would require very careful planning and testing to make sure it does not break other applications or OpenShift drivers that may implicitly rely on these defaults (e.g. volume provisioning with CSI typically relies on an arbitrary uid with gid 0 to grant proper access to the data in a PV).

So, with regard to file permissions, the storage admin needs to ensure that the uid, gid and mode bits (chmod) of each file and directory in the shared backing directory of a PV in IBM Spectrum Scale (plus potentially configured ACLs) allow proper access to all intended OpenShift users, who may run under arbitrary uids 1xxxxxxxxx and gid 0 depending on the associated SCCs.
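
A common pattern (a sketch only, assuming the /mnt/fs1/training-data directory from the earlier examples) is to grant group 0 the required permissions on the backing directory, since pods under the restricted SCC run with an arbitrary uid but with gid 0:

# chgrp -R 0 /mnt/fs1/training-data
# chmod -R g+rwX /mnt/fs1/training-data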

The SELinux context is applied by recursively relabeling all of the files and nested sub-directories in the backing directory of a PV at the moment a pod is started and actively mounting the PVC/PV. This is very important to understand! The SELinux relabeling does not happen when a PV is bound to a PVC. It happens as soon as a pod consuming this PVC is started. If pods from different user namespaces with different SELinux contexts are started and are accessing the same data on the backend in IBM Spectrum Scale, then the last pod started wins and maintains access to the data in the shared backing directory of the PVs, while the earlier pods are locked out from all access to the data. This loss of data access happens almost undetected as it will only be seen in the SELinux logs on the OpenShift worker node where the pod is running (e.g. by logging in as the core user directly on the worker node and using commands like sudo aureport -a or sudo ausearch -m avc).

Beyond the options to properly apply customized SCCs and/or create special service accounts in the associated user namespaces, SELinux labels can also be configured on namespaces and in the securityContext of an individual pod or container.
A restricted user cannot simply change the default SELinux context of the namespace nor set the SELinux label in the securityContext of a pod or container to anything other than the user's associated default label. However, a cluster admin or privileged user can either edit the SELinux label of a user namespace or even start a specific pod in a given user namespace with a selected SELinux label specified in the pod's securityContext. This offers two simple ways to implement shared data access across namespaces for OpenShift users or applications without the need to create customized SCCs and/or service accounts (which would be the preferred option):

(1) Set SELinux context on a namespace

If volumes backed by the same data location in IBM Spectrum Scale will be shared across user namespaces, one option would be to have the cluster admin set the SELinux annotation of the involved namespaces in OpenShift to the same SELinux context as shown below (e.g., using oc edit ns NAMESPACE-NAME):

apiVersion: v1
kind: Namespace
metadata:
  annotations:
    openshift.io/sa.scc.mcs: s0:c26,c0
    openshift.io/sa.scc.supplemental-groups: 1000670000/10000
    openshift.io/sa.scc.uid-range: 1000670000/10000
    [...]

If users are allowed to share access to the same data then there might be a situation of trust where such an exception of not enforcing different SELinux labels per namespace might be a good enough solution. By doing so, users in these namespaces will apply the same SELinux context and can easily share access to the same data in IBM Spectrum Scale through statically provisioned PVs (i.e. PVs backed by the same directory in IBM Spectrum Scale).

As an example, assume data in IBM Spectrum Scale is accessed through a static PV from pods running in namespace #1. These pods apply the default SELinux context as defined in the annotation of namespace #1 (openshift.io/sa.scc.mcs: s0:c26,c0). If we plan for another user in a second namespace who will need to run their own pods with access to the same data, then a cluster admin can initially set the SELinux annotation of the second namespace to the same SELinux context of the first namespace (openshift.io/sa.scc.mcs: s0:c26,c0). This ensures that the same default SELinux context is applied in both namespaces and that shared access to the same data directory in IBM Spectrum Scale (/mnt/fs1/shared-data) through statically provisioned PVs from different namespaces in OpenShift is not prevented by SELinux.
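
Instead of editing the namespace interactively, a cluster admin could also set the annotation with a single command (a sketch assuming a hypothetical second namespace named namespace-2 and the SELinux context from the example; note that this only affects pods created after the change):

# oc annotate namespace namespace-2 openshift.io/sa.scc.mcs='s0:c26,c0' --overwrite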

(2) Set SELinux context in the securityContext of a pod or container

The second option to run pods in different user namespaces (each with a different SELinux context), where the pods need shared access to the same backing directory in IBM Spectrum Scale would be to have a privileged user run (or schedule) the pod with the properly defined SELinux labels in the securityContext section of the pod manifest as shown below (see Assign SELinux labels to a Container):

kind: Pod
apiVersion: v1
metadata:
  name: mypod
spec:
  containers:
    - name: test-pod
      image: ubi8/ubi-minimal:latest
      securityContext:
        seLinuxOptions:
          level: "s0:c26,c0"
  [...]

This might be a good option for automated or scheduled workloads (jobs) in OpenShift which are initiated by a privileged user. If only read access to the shared data is required for the job then the privileged user can also make use of the readOnly option in the volumeMounts section of the pod manifest to prevent any changes to the shared data directory.

As an example, assume data in IBM Spectrum Scale is already accessed through a static PV from pods running in namespace #1. These pods apply the default SELinux context as defined in the annotation of namespace #1 (openshift.io/sa.scc.mcs: s0:c26,c0). If we need to run another pod which requires access to the same data in a second namespace which applies a different (default) SELinux context, here namespace #2 (openshift.io/sa.scc.mcs: s0:c27,c19), then a privileged user (or a cluster admin) can start the pod in namespace #2 and define the proper SELinux context in the container's securityContext (level: "s0:c26,c0"). This ensures that the same SELinux context is applied and shared access to the same data directory in IBM Spectrum Scale (/mnt/fs1/shared-data) is not prevented by SELinux. This requires a privileged user to run the pod.


Comments

10 days ago

Hello @GERO SCHMIDT,

thanks for the clarification. The fileset mapping is now understood.

As a request to the Scale people, the man page and the short-cmd list should be updated to explain the --uid option. On Scale 5.1.3.1 this option is currently not documented.

Regards, Renar

11 days ago

Hi @renar Grunenberg,
thanks for the feedback. To answer your questions:

(1) IBM Spectrum Scale CSI Driver v2.5.0 introduced a new storage class for creating consistency group volumes (see Storage class for creating consistency group volumes). This storage class allows creating a consistent point-in-time snapshot of all PVs created from that storage class. It is aimed at applications where a consistent snapshot of multiple PVs is needed (not just a snapshot of an individual PV). By placing all PVs from a consistency group storage class (each PV is created as a dependent fileset) into one independent fileset (as the root fileset to host all these dependent filesets) we can use snapshots on the independent fileset to create a consistent point-in-time snapshot of all the nested dependent filesets (each backing a PV from the storage class).

(i) A consistency group is mapped to an independent fileset.
(ii) A volume (PV) in a consistency group is mapped to a dependent fileset within the independent fileset.

(2) The mm-command to retrieve the filesystem ID is indeed a little buried in the article. You can find it in paragraph (2) in the CSI volumeHandle section:

(2) The second parameter that we need is the UID of the IBM Spectrum Scale file system [...] We can obtain the UID [...] by executing the mmlsfs command as follows:
# oc exec worker1a -n ibm-spectrum-scale -- mmlsfs fs1 --uid

So you can obtain the filesystem ID simply by running:

mmlsfs fs1 --uid

with fs1 being the IBM Spectrum Scale filesystem name. The ID is the same whether you run the command on the remote storage cluster on the original file system name (e.g. essfs1) or on the local CNSA cluster on the local file system name for the remote mount (e.g. fs1 as the local name for the mounted remote file system essfs1).

19 days ago

Hello Gero,

great article. Can you clarify two points here a little bit?

1. Dependent filesets jointly embedded in an independent fileset??

  • consistency group volumes (backed by dependent filesets jointly embedded in an independent fileset for consistent snapshots of all contained volumes).

2. Is your mentioned filesystem ID the stripe group ID? And where can I find this ID with an mm cmd? (I know the mmfsadm dump fs cmd)

Regards Renar