Cloud Pak for Data


Introducing the use of the Spectrum Scale storage class in a CP4D environment.

By Guo Liang Wang posted Thu September 01, 2022 10:44 PM

  

Introduction

More and more companies are realizing the importance of storage security and convenience. For storage on cloud, traditional storage may no longer meet current application scenarios. Spectrum Scale is a clustered file system that provides concurrent access to a single file system or set of file systems from multiple nodes. The nodes can be SAN attached, network attached, a mixture of SAN attached and network attached, or in a shared-nothing cluster configuration. This enables high-performance access to a common set of data to support a scale-out solution or to provide a high-availability platform. In addition to common data access, it offers features such as data replication, policy-based storage management, and multi-site operations, as well as volume snapshots and snap mirrors based on CSI snapshots. Spectrum Scale container native allows the cluster file system to be deployed in a Red Hat OpenShift cluster. Using a remotely mounted file system, the container native deployment provides a persistent data store that applications access through the Spectrum Scale CSI driver by using Persistent Volumes (PVs).

Architecture of Spectrum Scale storage on CPD

    Figure 1-1

Detailed description of the layout of the figure

  • The bottom layer is the OpenShift cluster. On top of the OCP cluster is Spectrum Scale Container Native Storage Access (CNSA), the top half is the CP4D 4.5 product, and the right half is the remote GPFS cluster with its NSDs.
  • CNSA consists of three main namespaces: Spectrum Scale, CSI, and the operator. All Spectrum Scale components are deployed to the OCP cluster by means of the operator.
  • The operator eventually starts a worker pod on each OCP worker node and uses a daemon pod to mount the file system of the remote storage cluster (on the right side of the figure) onto the OpenShift worker nodes, making Spectrum Scale storage space available. For Cloud Pak for Data 4.5, services create Custom Resources (CRs) that request storage through the storage class.
  • The Spectrum Scale storage class calls the storage cluster's REST API through the CSI driver to create an independent fileset in the remote storage, which can be understood as a volume.
  • The remote storage returns the fileset to Spectrum Scale CSI through the API, and CSI then performs operations such as bind mounting: according to the PVC requested by the CP4D service, it mounts the PV at the corresponding path on the worker node to complete the PVC binding, making the storage available to CP4D services for use and data access, as illustrated below.
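
When a PVC provisioned through this storage class is bound, the backing independent fileset can be seen on the remote storage cluster. A minimal illustration, using the test-claim PVC created in PART 3 step 10 of this article (the fileset naming itself is generated by the CSI driver):

On the OpenShift cluster, find the PV backing the claim:

#oc get pvc test-claim -n default -o jsonpath='{.spec.volumeName}{"\n"}'
pvc-7805b775-8c8c-4b26-b756-448bdf757671

On the remote storage cluster, list the filesets of the gpfs1 file system; the dynamically provisioned volumes appear as independent filesets:

#/usr/lpp/mmfs/bin/mmlsfileset gpfs1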


Spectrum Scale provides a Container Storage Interface (CSI) driver, which doesn't cause notable I/O suspension during volume snapshotting of a CP4D backup process. Below are its key capabilities, followed by a snapshot sketch:

  • Volume Snapshot to support non-disruptive backup 

  • Standard CSI implementation for PV usage 

  • Storage online expansion without impact on Data
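
As a minimal sketch of the snapshot capability, a VolumeSnapshotClass and a VolumeSnapshot for a PVC provisioned by this driver could look like the following. The class and snapshot names are illustrative (not taken from this environment); the driver name matches the spectrumscale.csi.ibm.com provisioner used in the storage class later in this article:

cat << EOF | oc apply -f -
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ibm-spectrum-scale-snapshot-class   # illustrative name
driver: spectrumscale.csi.ibm.com
deletionPolicy: Delete
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: test-claim-snapshot                 # illustrative name
  namespace: default
spec:
  volumeSnapshotClassName: ibm-spectrum-scale-snapshot-class
  source:
    persistentVolumeClaimName: test-claim   # the PVC created in PART 3 step 10
EOF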


Configuring a Spectrum Scale storage class in a CP4D environment

INDEX

  1. PART 1 Installation documentation
  2. PART 2 Installing the remote GPFS storage cluster on Red Hat 8.x
  3. PART 3 Installing IBM Spectrum Scale Container Native Storage Access (CNSA) on the OCP environment

PART 1 Installation documentation

  1. Red Hat OpenShift Container Platform configuration - Applying MCO - link
  2. Adding IBM Cloud Container Registry credentials - Use job - link
  3. Installing the IBM Spectrum Scale container native operator and cluster
  4. Deploy the operator - Use command - link
  5. Configuring the IBM Spectrum Scale container native cluster custom resources
  6. Creating the IBM Spectrum® Scale container native cluster - Use command to apply the YAML file - link
  7. Creating secrets for storage cluster GUI - Use command - link
  8. Verifying the IBM Spectrum Scale container native cluster - Complete steps - link

PART 2 Installing the remote GPFS storage cluster on Red Hat 8.x

1. Download package from Fix Central

  • Fix Central link
  • gpfs.gui-5.1.1-3.noarch
  • gpfs.gss.pmcollector-5.1.1-3.el8.x86_64
  • gpfs.java-5.1.1-3.x86_64
  • gpfs.gpl-5.1.1-3.noarch
  • gpfs.gss.pmsensors-5.1.1-3.el8.x86_64
  • gpfs.msg.en_US-5.1.1-3.noarch
  • gpfs.compression-5.1.1-3.x86_64
  • gpfs.adv-5.1.1-3.x86_64
  • gpfs.license.dm-5.1.1-3.x86_64
  • gpfs.crypto-5.1.1-3.x86_64
  • gpfs.gskit-8.0.55-19.x86_64
  • gpfs.docs-5.1.1-3.noarch
  • gpfs.base-5.1.1-3.x86_64
  • gpfs.afm.cos-1.0.0-3.x86_64

2. Install the packages on RHEL 8.x

./Spectrum_Scale_Advanced-5.x.x.x-x86_64-Linux-install (note: x.x.x->1.1.3 or 1.2.1 and so on)
  Extracting License Acceptance Process Tool to /usr/lpp/mmfs/5.1.2.1 ...
 tail -n +648 ./Spectrum_Scale_Advanced-5.1.2.1-x86_64-Linux-install | tar -C /usr/lpp/mmfs/5.1.2.1 -xvz --exclude=installer   --exclude=*_rpms --exclude=*_debs --exclude=*rpm  --exclude=*tgz --exclude=*deb --exclude=*tools* 1> /dev/null
Program Name (Program Number):
IBM Spectrum Scale Standard Edition 5.1.2.1 (5737-F33)
IBM Spectrum Scale Advanced Edition 5.1.2.1 (5737-F35)

The following standard terms apply to Licensee's use of the 

Press Enter to continue viewing the license agreement, or 
enter "1" to accept the agreement, "2" to decline it, "3" 
to print it, "4" to read non-IBM terms, or "99" to go back 
to the previous screen.
1
License Agreement Terms accepted.
Extracting Product RPMs to /usr/lpp/mmfs/5.1.2.1 ...


3. Issue the following commands to install the packages for the Advanced Edition on each node:

#cd /usr/lpp/mmfs/5.1.1.3/zimon_rpms/rhel8
#cp -rp gpfs.gss* ../../gpfs_rpms/
#cd ../../gpfs_rpms/
#yum install *.rpm
#/usr/lpp/mmfs/bin/mmbuildgpl


4. Create a node file on one of the manager nodes.

In this example, NodesList (nodefile) is a file that contains the list of nodes and node designations to be added to the cluster. Its contents are as follows:
sre-spectrum-1.fyre.ibm.com:quorum
sre-spectrum-2.fyre.ibm.com:quorum
sre-spectrum-3.fyre.ibm.com:quorum

5. Create the cluster by using the following command from one of the nodes:
#mmcrcluster -N nodefile -p g1 -s g2 -C gpfs2  -A -r /usr/bin/ssh -R /usr/bin/scp


6. Accept proper role licenses for the nodes by using the following commands from one of the nodes.

#mmchlicense server -N all


7. IBM Spectrum Scale cluster is now created. You can view the configuration information of the cluster by using the following command.

# /usr/lpp/mmfs/bin/mmlscluster

GPFS cluster information
========================
  GPFS cluster name:         sre-spectrum-1.fyre.ibm.com
  GPFS cluster id:           2133347628076312447
  GPFS UID domain:           sre-spectrum-1.fyre.ibm.com
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp
  Repository type:           CCR

 Node  Daemon node name             IP address    Admin node name              Designation
-------------------------------------------------------------------------------------------
   1   sre-spectrum-1.fyre.ibm.com  10.11.71.223  sre-spectrum-1.fyre.ibm.com  quorum
   2   sre-spectrum-2.fyre.ibm.com  10.11.71.243  sre-spectrum-2.fyre.ibm.com  quorum
   3   sre-spectrum-3.fyre.ibm.com  10.11.72.30   sre-spectrum-3.fyre.ibm.com  quorum


8. Start the GPFS daemons and the cluster by using the following command from one of the nodes.

#/usr/lpp/mmfs/bin/mmstartup -a 
#/usr/lpp/mmfs/bin/mmgetstate -a

 Node number  Node name        GPFS state  
-------------------------------------------
       1      sre-spectrum-1   active
       2      sre-spectrum-2   active
       3      sre-spectrum-3   active


9. Create NSDs as follows.

%pool:
    pool=system
    blockSize=4M
    layoutMap=cluster
    usage=dataAndMetadata
    allowWriteAffinity=no
%nsd:
    servers=sre-spectrum-1
    nsd=01_vdb
    usage=dataAndMetadata
    device=/dev/vdb
    pool=system
%nsd:
    servers=sre-spectrum-2
    nsd=02_vdb
    usage=dataAndMetadata
    device=/dev/vdb
    pool=system
%nsd:
    servers=sre-spectrum-3
    nsd=03_vdb
    usage=dataAndMetadata
    device=/dev/vdb
    pool=system
....
Create the NSDs using the following command (mmcrnsd -F NSD_Stanza_Filename), for example:
#mmcrnsd -F nsd
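
As an added check (not part of the original steps), the created NSDs can be listed with the standard mmlsnsd command:

#/usr/lpp/mmfs/bin/mmlsnsd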


10. Create a GPFS file system using the following command.

#mmcrfs gpfs1 -F gpfs1 -A yes -i 4096 -m 2 -M 3 -n 32 -r 2 -R 3
#mmmount gpfs1 -a
#df -hT |grep gpfs
gpfs1                 gpfs      1.2T   19G  1.2T   2% /gpfs1


11. Enabling the GUI and creating the operator user and group

  • To enable and start gpfsgui.service:
#systemctl enable gpfsgui.service
Created symlink /etc/systemd/system/multi-user.target.wants/gpfsgui.service → /usr/lib/systemd/system/gpfsgui.service.
# systemctl start gpfsgui.service
# systemctl status gpfsgui.service
● gpfsgui.service - IBM_Spectrum_Scale Administration GUI
   Loaded: loaded (/usr/lib/systemd/system/gpfsgui.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2022-03-17 22:39:41 EDT; 45s ago
  Process: 199160 ExecStartPre=/usr/lpp/mmfs/gui/bin-sudo/cleanupdumps (code=exited, status=0/SUCCESS)
  Process: 198810 ExecStartPre=/usr/lpp/mmfs/gui/bin-sudo/check4sudoers (code=exited, status=0/SUCCESS)
  Process: 198756 ExecStartPre=/usr/lpp/mmfs/gui/bin-sudo/check4iptables (code=exited, status=0/SUCCESS)
  Process: 198555 ExecStartPre=/usr/lpp/mmfs/gui/bin-sudo/check4pgsql (code=exited, status=0/SUCCESS)
  Process: 198549 ExecStartPre=/usr/lpp/mmfs/gui/bin-sudo/update-environment (code=exited, status=0/SUCCESS)
 Main PID: 199172 (java)
   Status: "GSS/GPFS GUI started"
    Tasks: 136 (limit: 204321)
   Memory: 372.6M (limit: 2.0G)
   CGroup: /system.slice/gpfsgui.service
           ├─199172 /usr/lpp/mmfs/java/jre/bin/java -XX:+HeapDumpOnOutOfMemoryError -Dhttps.protocols=TLSv1.2,TL>
           └─219770 /bin/bash -c set -m; sudo mmlspolicy 'gpfs2' -L 
  • Creating Operator User and Group
  • To verify whether the IBM Spectrum Scale GUI user group ContainerOperator exists, enter the following command:
#/usr/lpp/mmfs/gui/cli/lsusergrp ContainerOperator
Name              ID Role              MFA
ContainerOperator 11 containeroperator FALSE
EFSSG1000I The command completed successfully.
  • To create the ContainerOperator GUI user group if it does not exist, enter the following command:
#/usr/lpp/mmfs/gui/cli/mkusergrp ContainerOperator --role containeroperator
EFSSP1102C CLI: The value "ContainerOperator" specified for "groupName" is already in use.
  • To verify whether an IBM Spectrum Scale GUI user exists within the ContainerOperator group, enter the following command:
#/usr/lpp/mmfs/gui/cli/lsuser | grep ContainerOperator
cnsa_storage_gui_user           active          ContainerOperator 0                     FALSE
  • To create the GUI user for the ContainerOperator group, enter the following command:
#/usr/lpp/mmfs/gui/cli/mkuser cnsa_storage_gui_user -p cnsa_storage_gui_password -g ContainerOperator
EFSSG0019I The user cnsa_storage_gui_user has been successfully created.
EFSSG1000I The command completed successfully.
  • By default, user passwords expire after 90 days. If the security policy of your organization permits it, then enter the following command to create the user with a password that never expires:
# /usr/lpp/mmfs/gui/cli/mkuser cnsa_storage_gui_user -p cnsa_storage_gui_password -g ContainerOperator -e 1
EFSSP1102C CLI: The value "cnsa_storage_gui_user" specified for "userID" is already in use.
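
Optionally, you can verify that the GUI user can authenticate against the Spectrum Scale management REST API, which CNSA and the CSI driver use. A sketch, assuming the GUI is reachable on port 443 of the storage cluster GUI node (replace <storage-gui-host> with your GUI node); the /scalemgmt/v2/cluster endpoint is part of the Scale management API:

# curl -k -u 'cnsa_storage_gui_user:cnsa_storage_gui_password' https://<storage-gui-host>:443/scalemgmt/v2/cluster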

PART 3 Installing IBM Spectrum Scale Container Native Storage Access (CNSA) on the OCP environment

1. Applying Machine Config Operator (MCO) Settings

Purpose:
The Machine Config Operator manages and applies configuration and updates of the base operating system and container runtime, including everything between the kernel and kubelet.

  • Increase pids_limit: Increase the pids_limit to 4096. Without this change, the GPFS daemon crashes during I/O by running out of PID resources.
  • Kernel Devel/Header Packages: Install the kernel related packages for IBM Spectrum Scale to successfully build its portability layer.
cat << EOF | oc apply -f -
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  name: 01-worker-ibm-spectrum-scale-increase-pid-limit
spec:
  containerRuntimeConfig:
    pidsLimit: 4096
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/worker: ""
---
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 00-worker-ibm-spectrum-scale-kernel-devel
spec:
  config:
    ignition:
      version: 3.2.0
  extensions:
  - kernel-devel
EOF

Check the status and wait until the UPDATED column shows True (and UPDATING returns to False):

oc get mcp

[root@api.gyron.cp.fyre.ibm.com ~]# oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-8eec77e4719c2cd2505a92397b9c9207   True      False      False      3              3                   3                     0                      23h
worker   rendered-worker-2ade1601b3372c7bc0c29753953806e1   True      False      False      3              3                   3                     0                      23h

Verify that the pids_limit is increased on the worker nodes

 oc get nodes -lnode-role.kubernetes.io/worker= \
 -ojsonpath="{range .items[*]}{.metadata.name}{'\n'}" |\
 xargs -I{} oc debug node/{} -T -- chroot /host crio-status config | grep pids_limit

Validate that the kernel-devel package was installed successfully:

oc get nodes -lnode-role.kubernetes.io/worker= \
-ojsonpath="{range .items[*]}{.metadata.name}{'\n'}" |\
xargs -I{} oc debug node/{} -T -- chroot /host sh -c "rpm -q kernel-devel"

2. Add IBM Cloud Container Registry credentials (pull secret)

  • Add cp.icr.io token to global pull secrets:
# ENTITLEMENT_KEY must already be set to your IBM entitlement key
BASE64_ENCODED_KEY=$(echo -n "cp:$ENTITLEMENT_KEY" | base64 -w0)
cat <<- EOF > authority.json
{
  "auth": "$BASE64_ENCODED_KEY",
  "username": "cp",
  "password": "$ENTITLEMENT_KEY"
}
EOF
oc get secret/pull-secret -n openshift-config -o json | \
  jq -r '.data[".dockerconfigjson"]' | \
  base64 -d - | \
  jq '.auths."cp.icr.io" += input' - authority.json > /tmp/temp_config.json
oc set data secret/pull-secret -n openshift-config --from-file=.dockerconfigjson=/tmp/temp_config.json
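
To confirm the registry entry was added, you can list the registries now present in the global pull secret (a quick verification added here; cp.icr.io should appear in the output):

oc get secret/pull-secret -n openshift-config -o json | \
  jq -r '.data[".dockerconfigjson"]' | base64 -d | jq '.auths | keys'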

3. Creating secrets for the storage cluster GUI

#oc new-project ibm-spectrum-scale

#oc create secret generic cnsa-remote-mount-storage-cluster-1 --from-literal=username='cnsa_storage_gui_user' \
--from-literal=password='cnsa_storage_gui_password' -n ibm-spectrum-scale
#oc create secret generic csi-remote-mount-storage-cluster-1 --from-literal=username=csi-storage-gui-user --from-literal=password=csi-storage-gui-password -n ibm-spectrum-scale-csi
#oc label secret csi-remote-mount-storage-cluster-1 -n ibm-spectrum-scale-csi product=ibm-spectrum-scale-csi

4. Configuring the IBM Spectrum Scale container native cluster custom resources

GH_TOKEN="xxxxxxxx"
ClusterCR="raw.github.ibm.com/HCDP/DevOps/master/devopsscripts"
oc apply -f https://$GH_TOKEN@${ClusterCR}/spectrumscale/scale_v1beta1_cluster_cr.yaml

5. Verifying the IBM Spectrum Scale container native cluster pods

#oc project ibm-spectrum-scale
#oc get po
NAME                               READY   STATUS    RESTARTS   AGE
ibm-spectrum-scale-gui-0           4/4     Running   4          10d
ibm-spectrum-scale-gui-1           4/4     Running   4          10d
ibm-spectrum-scale-pmcollector-0   2/2     Running   2          10d
ibm-spectrum-scale-pmcollector-1   2/2     Running   2          10d
worker0                            2/2     Running   0          10d
worker1                            2/2     Running   0          10d
worker2                            2/2     Running   0          10d

6. Check the status of the container native cluster

[root@api.sre-3m3w-spectrum-test-02.cp.fyre.ibm.com ~]# oc exec $(oc get pods -lapp.kubernetes.io/name=core -n ibm-spectrum-scale -o json | jq -r ".items[0].metadata.name") -- mmlscluster
Defaulted container "gpfs" out of: gpfs, logs, mmbuildgpl (init), config (init)

GPFS cluster information
========================
  GPFS cluster name:         ibm-spectrum-scale.sre-3m3w-spectrum-test-02.cp.fyre.ibm.com
  GPFS cluster id:           16309187935019360933
  GPFS UID domain:           ibm-spectrum-scale.sre-3m3w-spectrum-test-02.cp.fyre.ibm.com
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp
  Repository type:           CCR

 Node  Daemon node name  IP address    Admin node name  Designation
--------------------------------------------------------------------
   1   worker0           10.17.21.185  worker0          quorum-manager-perfmon
   2   worker1           10.17.23.210  worker1          quorum-manager-perfmon
   3   worker2           10.17.23.224  worker2          quorum-manager-perfmon

7. Check the mount point of the remote storage file system

[root@api.sre-3m3w-spectrum-test-02.cp.fyre.ibm.com ~]# oc exec $(oc get pods -lapp.kubernetes.io/name=core -n ibm-spectrum-scale -o json | jq -r ".items[0].metadata.name") -- df -Th |grep gpfs1
Defaulted container "gpfs" out of: gpfs, logs, mmbuildgpl (init), config (init)
gpfs1          gpfs     1.2T   18G  1.2T   2% /mnt/gpfs1

8. Check the cluster ID in the Daemon CR

# oc get daemon ibm-spectrum-scale -n ibm-spectrum-scale -o yaml |grep clusterID|awk 'END{print $2}'
"16309187935019360933"

9. Creating a storage class for fileset-based volumes

cat << EOF | oc apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ibm-spectrum-scale-sc
provisioner: spectrumscale.csi.ibm.com
parameters:
  volBackendFs: gpfs1
  clusterId: "2133347628076312447" # cluster ID of storage cluster
reclaimPolicy: Delete
EOF

#oc get sc|grep ibm-spectrum-scale-sc
ibm-spectrum-scale-sc         spectrumscale.csi.ibm.com               Delete          Immediate              false 

10. Validate PVC binding with ibm-spectrum-scale-sc

oc apply -f - << EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-claim
  namespace: default
  annotations:
    volume.beta.kubernetes.io/storage-class: "ibm-spectrum-scale-sc"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
EOF
#oc get pvc  -n default
NAME         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS            AGE
test-claim   Bound    pvc-7805b775-8c8c-4b26-b756-448bdf757671   1Gi        RWX            ibm-spectrum-scale-sc   9d

If the PVC shows a Bound status as above, the storage class is installed properly.
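
As an optional extra check, a simple pod can mount the PVC and write a file to it. The pod name and busybox image below are illustrative and are not part of the original procedure:

cat << EOF | oc apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: scale-pvc-test          # illustrative name
  namespace: default
spec:
  containers:
  - name: writer
    image: busybox               # illustrative image
    command: ["sh", "-c", "echo hello > /data/hello.txt && sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: test-claim      # the PVC validated in step 10
EOF

Once the pod is running, reading the file back confirms that data can be written to the Spectrum Scale volume:

#oc exec scale-pvc-test -n default -- cat /data/hello.txt
hello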




#CloudPakforDataGroup
#Highlights
#Highlights-home