File and Object Storage

Spectrum Scale NAS at home part 1: Building

By Maarten Kreuger

  

IBM Spectrum Scale is an enterprise Software Defined Storage product, which means it is scalable, high performance, full featured, complex, and never seen in the home environment. But since last year there has been a special version of Spectrum Scale that is available free for home or development use: it contains all features but is capped at 12 TiB of storage.

So, why would you want to run an enterprise product like Spectrum Scale at home? Just use a Synology, QNAP, or a purpose-built NAS distribution like FreeNAS. But where is the fun in that? If you want total control over where your data is stored, and want to use enterprise-level replication and cloud access features, this is your chance. Download it at: https://www.ibm.com/products/spectrum-scale

You can install the software on any x86_64 Linux-based system, but some distributions are better supported than others: RHEL/CentOS works best, followed by SLES, then Ubuntu. Some features do not work on some operating systems, and some only work on certain levels of Spectrum Scale or the OS. It's complicated. Check tables 13 and 17 at Q2.1: https://www.ibm.com/docs/en/spectrum-scale?topic=STXKQY/gpfsclustersfaq.html

My test system is a simple NUC running Ubuntu 20.04 LTS with an Intel Celeron and 4 GB of RAM, a built-in SSD, and a USB-attached SATA drive. More RAM/CPU is always welcome, depending on how many and which features you're using. For production use we recommend at least 64 GB of RAM, or 256 GB if you run a lot of services. Also, Spectrum Scale is a clustered filesystem, so you can have something like a thousand systems in your cluster; we'll just use one to start with.

Let's jump straight into the installation procedure. Unzip and extract the RPM and Debian packages:

# unzip Spectrum_Scale_Developer-5.1.0.3-x86_64-Linux.zip
Archive:  Spectrum_Scale_Developer-5.1.0.3-x86_64-Linux.zip
  inflating: Spectrum_Scale_Developer-5.1.0.3-x86_64-Linux.README
  inflating: Spectrum_Scale_Developer-5.1.0.3-x86_64-Linux-install
 extracting: Spectrum_Scale_Developer-5.1.0.3-x86_64-Linux-install.md5
  inflating: SpectrumScale_public_key.pgp

# chmod +x Spectrum_Scale_Developer-5.1.0.3-x86_64-Linux-install
# ./Spectrum_Scale_Developer-5.1.0.3-x86_64-Linux-install

 This unpacks the software RPMs/Debs into /usr/lpp/mmfs/<version>. Why /usr/lpp/mmfs/ and not /opt? That's because Spectrum Scale was originally developed for use on AIX in the nineties, and that's where it went. So. Tradition.

You can look up the full installation instructions in the documentation: https://www.ibm.com/docs/en/spectrum-scale/5.1.0?topic=installing

Or follow along with my steps. There are two ways to install: manual or automatic. The automatic installer is really nice if you have a lot of systems to install, but we'll create a singleton cluster, so manual it is.

The first step is to create the apt sources for Spectrum Scale (or yum repositories if you're on RHEL/SLES):

NB: There is a bug in this script in v5.1.0.3 (sorry). Change line 142
from:  osVersion      = linux_dist[1][:2]
to:    osVersion      = ""
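
If you prefer to patch it from the shell rather than an editor, a sed one-liner along these lines should do it; double-check that the pattern still matches your copy of the script first:

# sed -i 's/osVersion *= *linux_dist\[1\]\[:2\]/osVersion = ""/' /usr/lpp/mmfs/5.1.0.3/tools/repo/local-repo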

# /usr/lpp/mmfs/5.1.0.3/tools/repo/local-repo --repo
Creating repo: /etc/apt/sources.list.d/ganesha.list
Creating repo: /etc/apt/sources.list.d/gpfs.list
Creating repo: /etc/apt/sources.list.d/object.list
Creating repo: /etc/apt/sources.list.d/smb.list
Creating repo: /etc/apt/sources.list.d/zimon.list
Creating repo: /etc/apt/sources.list.d/gpfs2.list

As these repositories are not signed, we need to enable unsigned repos:

# apt -o Acquire::AllowInsecureRepositories=true \
      -o Acquire::AllowDowngradeToInsecureRepositories=true update
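
If you would rather not pass those options on every apt run, you can mark the generated repos as trusted once. This is just a sketch and assumes the generated .list files start with a plain "deb " line, so have a look at one of them before running it:

# sed -i 's/^deb /deb [trusted=yes] /' /etc/apt/sources.list.d/{ganesha,gpfs,gpfs2,object,smb,zimon}.list
# apt update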

We can then install Spectrum Scale. Any dependencies should be resolved automatically with your standard repos. Updating works the same way.

# apt install gpfs.base gpfs.gpl gpfs.gskit gpfs.license.dev gpfs.docs gpfs.afm.cos gpfs.compression gpfs.gui gpfs.protocols-support gpfs.pmsensors gpfs.pmcollector gpfs.nfs-ganesha\* gpfs.smb* gpfs.pm-ganesha*
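
A quick way to see what actually got installed (on RHEL/SLES the equivalent would be rpm -qa | grep gpfs):

# dpkg -l | grep gpfs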

 Before we start building the cluster, we need to prepare the system.

First, add Spectrum Scale to the PATH:

# echo "export PATH=/usr/lpp/mmfs/bin:\$PATH" > /etc/profile.d/gpfs.sh

# source /etc/profile.d/gpfs.sh

Next we'll manually build the kernel extension for GPFS to test if it works. Perhaps a C-compiler is not installed, or there is a kernel problem that needs fixing. We'll make this process automatic at start time later.

# mmbuildgpl
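
If mmbuildgpl complains about a missing compiler or kernel headers, installing the usual build tooling first should sort it out. On Ubuntu that is roughly the following (on RHEL/SLES you would want gcc, make, and kernel-devel instead):

# apt install gcc make linux-headers-$(uname -r)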

Next step is to get the prerequisites in order (a quick sanity check is sketched right after this list):

  1. Make the IP address you're using static, or fix the assignment in your DHCP server (really important, do not skip this step!).
  2. NTP time synchronization, check with timedatectl.
  3. DNS/hosts, check with ping `hostname` and/or host `hostname`. I'm adding two entries to /etc/hosts, one static for GPFS, one floating for NAS access:
    1. 192.168.178.199 scalenode1
    2. 192.168.178.200 nas1
  4. Firewall, either disable it or configure it correctly:
    1. Ubuntu: ufw disable
    2. RHEL/SLES: systemctl disable firewalld --now
    3. https://www.ibm.com/docs/en/spectrum-scale/5.1.0?topic=topics-securing-spectrum-scale-system-using-firewall
  5. SSH permissions for issuing commands as root to all my cluster nodes:
    1. ssh-keygen -t rsa -N "" -f /root/.ssh/id_rsa
    2. cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
    3. ssh -o StrictHostKeyChecking=no `hostname` date
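
Before moving on, a quick sanity check of the points above can save some head scratching later. I use getent here instead of host because it also consults /etc/hosts; substitute your own hostnames:

# timedatectl | grep -i synchronized
# getent hosts scalenode1 nas1
# ssh -o StrictHostKeyChecking=no scalenode1 date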

 

Now we're ready to create the Spectrum Scale cluster:

# mmcrcluster -N `hostname`:manager-quorum -A
# mmchlicense server --accept -N all
# mmchconfig autoBuildGPL=yes
# mmlscluster
# mmstartup

Check the state of the node; it may be in "arbitrating" mode for a while, but it should go "active" automatically. If not, check your firewall, or check the logfile /var/mmfs/gen/mmfslog.

# mmgetstate

 Node number  Node name        GPFS state
-------------------------------------------
       1      scalenode1       active
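
If you don't feel like re-running mmgetstate by hand while the node arbitrates, a small wait loop like this will do it for you (a sketch; it simply polls every ten seconds until the state column says active):

# until mmgetstate | grep -q active; do sleep 10; done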

 

Now that the cluster is running, we can create a filesystem. For this we need block devices; these can be all kinds of full devices (iSCSI, USB, SAS, SAN, NVMe) or partitions on those devices. Run lsblk to get a list.
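
For a slightly more focused view than plain lsblk, something like this shows just the essentials:

# lsblk -o NAME,SIZE,TYPE,MOUNTPOINT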

I have an internal MMC device, which is a bit of an issue, as Spectrum Scale only looks for "regular" devices; check your devices with mmdevdiscover. As my device is not listed, I'll need to add it using a custom script:

# cat > /var/mmfs/etc/nsddevices <<EOF
#!/bin/ksh
# List every partition (except loop devices) as a "generic" device for GPFS.
cat /proc/partitions | grep -v loop | grep '[0-9]' | while read x x x part
do
   echo \$part generic
done
# Return 1 so GPFS also continues with its normal device discovery.
return 1
EOF

# chmod +x /var/mmfs/etc/nsddevices
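
To check that the override actually works, run it by hand and then ask mmdevdiscover again; the MMC partitions should now show up:

# /var/mmfs/etc/nsddevices
# mmdevdiscover | grep mmcblk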

 
Now we can define the partition as an NSD (Network Shared Disk) which we do with a stanza file:

# cat > local.nsd << EOF
%nsd:
  device=/dev/mmcblk1p2
  nsd=mmc
  servers=`hostname`
  usage=dataAndMetadata
  failureGroup=1
  pool=system
EOF

We name the NSD "mmc" and specify the partition. The servers option is the list of servers that have direct access to this device, which is just this system; you need iSCSI or an FC SAN to have shared access from multiple systems. The usage is the default: we'll put both data (file data) and metadata (directories, inodes, structures, logs) on this device. The failureGroup is 1, as this is our first and only server; this value guides the data replication feature of GPFS when placing copies on multiple systems. The pool is the default "system" pool, which is the mandatory location for metadata.

# mmcrnsd -F local.nsd
# mmlsnsd -M

 Disk name  NSD volume ID      Device          Node name    Remarks
---------------------------------------------------------------------------
 mmc        C0A8B2C760746935   /dev/mmcblk1p2  scalenode1   server node

 

The NSD is now created, which means an NSD identification number is written to the partition and the device is registered in the cluster administration. The next job is to create a filesystem using this NSD.

We'll build a default file system with automount enabled, nfs4 ACLs, default replicas set to 1 for data and metadata, and 3 as the maximum. The mountpoint is set to /nas1, which neatly matches the special device name. You can change these later if you want, but not the maximum replica settings.

# mmcrfs nas1 -F local.nsd -A yes -k nfs4 -r 1 -R 3 -m 1 -M 3 -T /nas1
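
Before putting data on it, mmlsfs is a handy double-check that the replica and ACL settings came out as intended:

# mmlsfs nas1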

 The file system is now ready! We just need to mount it:

# mmmount nas1
# mmlsdisk nas1
disk         driver   sector failure  holds    holds                storage
name         type       size group    metadata data  status  avail  pool
------------ -------- ------ -------- -------- ----- ------- ------ -------
mmc          nsd         512        1      yes   yes   ready up     system

# df -h /nas1
Filesystem      Size  Used Avail Use% Mounted on
nas1             29G  1,4G   28G   5% /nas1

Major changes like adding or removing disks or nodes can be done online via the command line. More user-oriented actions like adding an NFS or SMB export, creating snapshots, or setting file management policies can also be done using the GUI.
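
As an example of such an online change: if I later attach a second drive, growing the filesystem would look roughly like this. The device (/dev/sdb1) and NSD name (usb1) here are hypothetical, so adjust them to whatever lsblk shows on your system:

# cat > extra.nsd << EOF
%nsd:
  device=/dev/sdb1
  nsd=usb1
  servers=scalenode1
  usage=dataAndMetadata
  failureGroup=1
  pool=system
EOF
# mmcrnsd -F extra.nsd
# mmadddisk nas1 -F extra.nsd

After adding a disk you can optionally rebalance existing data across all disks with mmrestripefs nas1 -b.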

Stopping and starting the cluster is done using the following commands:

# mmshutdown
# mmstartup

The next blog will show the creation of an NFS share.


 

