 GPFS expansion query

Mike Lavery posted Mon February 16, 2026 04:38 AM

I have a customer with 20k CPU/GPU cores of compute, with servers connected directly to fast storage. IBM Storage Scale (GPFS) has been deployed and the customer is running their HPC applications on it.

We now have a 2nd phase of compute with GPFS nodes and storage. Same design again where the GPFS nodes are directly connected to their own high performance storage.

Once deployed, we need to bring both phases together. As we have two GPFS server/storage deployments, I am assuming we can expand the current GPFS as the logical network layer. With LSF managing workload scheduling etc., I assume this is how it should look from a GPFS perspective?

Thoughts and help would be useful. Thanks.

Roberto Renna

Hi Mike,

Your understanding of the “logical network layer” is correct and corresponds to a configuration officially supported by IBM, but it is not the only option. Note also that the IBM Storage Scale documentation describes no single command that merges two clusters (there is no mmmergecluster).
 
The IBM documentation (“IBM Storage Scale cluster configurations”) lists four basic configurations: all nodes attached to a common set of LUNs; some nodes acting as NSD clients; a cluster distributed across multiple sites; and data shared between clusters. For your scenario (two identical server+storage deployments, each with direct-attached storage), there are two options that best fit.

SINGLE CLUSTER (expansion of the Phase 1 cluster) 
Add the Phase 2 nodes to the existing cluster using mmaddnode (“the new nodes are available immediately upon completion of the operation”). The Phase 2 storage is presented as new NSDs. To bring the file system already existing on Phase 2 into the single cluster, use mmexportfs / mmimportfs: mmexportfs retrieves the information needed to move a file system to a different cluster, while mmimportfs imports it into the destination cluster.
However, in a single cluster, each node either has direct access to the disks or accesses them via an NSD server. If the Phase 1 nodes are directly attached to their own storage and the Phase 2 nodes to theirs, in the merged cluster the nodes of one phase will access the other’s storage over the network as NSD clients, unless they share the same SAN. Also keep in mind that you cannot mix different operating systems to directly access the same set of LUNs on a SAN.
Result: single namespace, single administrative and security domain.
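
As a rough sketch of the SINGLE CLUSTER sequence, the snippet below only builds and prints the three commands named above; it executes nothing. The device name fs2, the export file path, and the node names are hypothetical placeholders. It also assumes the Phase 2 file system is unmounted on all nodes before mmexportfs runs, that the Phase 2 nodes have already been removed from their old cluster (a node can belong to only one cluster), and that the device name does not clash with an existing Phase 1 file system. Licensing, daemon startup, and any NSD server changes are left out.

```python
"""Dry-run sketch: print the commands for folding Phase 2 into the Phase 1 cluster."""
import shlex


def single_cluster_plan(fs_device, export_file, new_nodes):
    """Return the command sequence as argv lists; nothing is executed."""
    return [
        # On the Phase 2 cluster: save the file system definition to a file.
        # Assumption: the file system is unmounted on all nodes first.
        ["mmexportfs", fs_device, "-o", export_file],
        # On the Phase 1 cluster: add the former Phase 2 nodes.
        # Assumption: they no longer belong to the old cluster.
        ["mmaddnode", "-N", ",".join(new_nodes)],
        # On the Phase 1 cluster: import the saved definition.
        ["mmimportfs", fs_device, "-i", export_file],
    ]


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    plan = single_cluster_plan("fs2", "/tmp/fs2.export", ["p2-nsd01", "p2-nsd02"])
    for argv in plan:
        print(shlex.join(argv))
```

Treat the printed lines as a checklist to compare against the mmexportfs and mmimportfs reference pages, not as a script to run unmodified.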

MULTICLUSTER WITH REMOTE MOUNT (this is what you call the “logical network layer”) 
The two clusters remain separate, and each mounts the other’s file system. IBM calls this a “multicluster environment”: the IBM Storage Scale clusters are managed independently but share access to data via remote cluster mounts. The mechanism relies on three commands: mmauth, mmremotecluster, and mmremotefs. From an application perspective, the experience is identical to a local setup: once the remote file system is mounted, all access occurs as if you were on the host cluster.
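
To make the division of work between the three commands concrete, here is a minimal dry-run sketch for one direction of the remote mount (repeat it with the roles swapped for the other direction). "Owning" means the cluster that hosts the file system and "accessing" means the cluster that mounts it remotely. Cluster names, key file paths, device names, and the mount point are all hypothetical, and the sketch assumes the public key files have already been exchanged between the two administrators. Whether the mmauth update step is needed depends on the cipherList already in effect, so treat that line as an assumption too.

```python
"""Dry-run sketch: commands for one direction of a multicluster remote mount."""
import shlex


def remote_mount_plan(owning, accessing, fs_device, local_device,
                      contact_nodes, owning_key, accessing_key, mount_point):
    """Return {'owning': [...], 'accessing': [...]} as argv lists; nothing runs."""
    auth_setup = [
        ["mmauth", "genkey", "new"],               # create this cluster's key pair
        ["mmauth", "update", ".", "-l", "AUTHONLY"],  # require authentication
    ]
    return {
        "owning": auth_setup + [
            # Register the accessing cluster's public key, then grant it the file system.
            ["mmauth", "add", accessing, "-k", accessing_key],
            ["mmauth", "grant", accessing, "-f", fs_device],
        ],
        "accessing": auth_setup + [
            # Describe the owning cluster: contact nodes and its public key.
            ["mmremotecluster", "add", owning, "-n", ",".join(contact_nodes),
             "-k", owning_key],
            # Map the remote device to a local device name and mount point.
            ["mmremotefs", "add", local_device, "-f", fs_device,
             "-C", owning, "-T", mount_point],
            ["mmmount", local_device, "-a"],
        ],
    }


if __name__ == "__main__":
    # Every name and path here is a hypothetical placeholder.
    plan = remote_mount_plan(
        owning="phase1.example.com", accessing="phase2.example.com",
        fs_device="fs1", local_device="rfs1",
        contact_nodes=["p1-nsd01", "p1-nsd02"],
        owning_key="/tmp/phase1_id_rsa.pub",
        accessing_key="/tmp/phase2_id_rsa.pub",
        mount_point="/gpfs/fs1",
    )
    for side, commands in plan.items():
        print(f"# on the {side} cluster")
        for argv in commands:
            print(shlex.join(argv))
```

Using the same mount point on both sides keeps paths consistent for jobs, whichever cluster they land on.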

However, in this case, you must be careful, as there are two constraints to observe:
· Full connectivity: every node in one cluster that needs to access the other cluster’s file system must be able to open a TCP/IP connection to EVERY node in the other cluster.
· High-speed networks: to take advantage of high-speed HPC networks (InfiniBand) between the two clusters, you must set the subnets attribute with mmchconfig, as described in the documentation on multicluster with multiple NSD servers.
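
The full-connectivity constraint can be spot-checked before any remote mount is configured. The sketch below tries a plain TCP connect from the node it runs on to each node of the other cluster. It assumes the daemon listens on the default port 1191 and that the daemon is already running on the target nodes; the host names are hypothetical. It covers only one source node, so it would need to run on every node that will mount the remote file system, and it says nothing about the subnets setting.

```python
"""Sketch: check TCP reachability of the other cluster's nodes from this node."""
import socket

GPFS_DAEMON_PORT = 1191  # default daemon port; assumed unchanged in this setup


def unreachable_nodes(nodes, port=GPFS_DAEMON_PORT, timeout=2.0):
    """Return the nodes for which a TCP connect to `port` fails or times out."""
    failed = []
    for node in nodes:
        try:
            # Open and immediately close a connection; success means reachable.
            with socket.create_connection((node, port), timeout=timeout):
                pass
        except OSError:
            # Covers refused connections, timeouts, and name resolution errors.
            failed.append(node)
    return failed


if __name__ == "__main__":
    remote_nodes = ["p2-nsd01", "p2-nsd02"]  # hypothetical host names
    print("unreachable:", unreachable_nodes(remote_nodes))
```

An empty list means every listed node accepted the connection from this node; any name that comes back points at a firewall, routing, or name resolution problem to fix first.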

LSF essentially schedules jobs on the compute nodes, and those nodes simply need to have the file system mounted and visible with consistent paths; whether the mount is local (SINGLE CLUSTER) or remote (MULTICLUSTER WITH REMOTE MOUNT) is entirely irrelevant to the scheduler.
Therefore, LSF neither determines nor constrains the GPFS topology; the choice between a single cluster and a multicluster setup rests on other criteria. IBM’s documentation names the factors that drive this decision: I/O performance requirements and application reliability; properties of the underlying storage hardware; and administration, security, and ownership considerations.

Essentially, if you need a single administrative domain and a single namespace, and the two phases are in the same data center with adequate networking, the SINGLE CLUSTER option is the way to go. If you want to maintain administrative and security separation between the two phases (separate teams, ownership, independent maintenance) while still exposing the data to both, the remote mount multicluster option is more suitable and, let me add, the least invasive: each phase remains intact, and only the mount relationship is added.
In summary, even though I may have gone on a bit too long, your intuition regarding the logical network layer is correct and aligns with an officially supported configuration (the remote mount multicluster), but it is one of two options that fit. There is no single-command merge: you either expand one cluster by absorbing nodes and file systems from the other, or you leave two independent federated clusters. The decision is based on performance, hardware, and above all on the desired administrative/security model, not on LSF.

I’ll leave you with some official IBM references I’ve collected over time when I’ve ventured into and gotten lost in the intricacies of GPFS in the past, since it remains a product I fell in love with from the very first encounter, but like all relationships, there’s love and hate every now and then 😄
  
• IBM Storage Scale cluster configurations:
https://www.ibm.com/docs/en/storage-scale/5.2.3?topic=overview-storage-scale-cluster-configurations
• Adding nodes to a GPFS cluster (mmaddnode):
https://www.ibm.com/docs/en/storage-scale/5.2.2?topic=cluster-adding-nodes-gpfs
• mmexportfs command:
https://www.ibm.com/docs/en/storage-scale/5.2.2?topic=reference-mmexportfs-command
• mmimportfs command:
https://www.ibm.com/docs/en/storage-scale/5.2.2?topic=reference-mmimportfs-command
• Shared file system access among IBM Storage Scale clusters (multicluster):
https://www.ibm.com/docs/en/storage-scale/5.2.2?topic=sss-shared-file-system-access-among-storage-scale-clusters
• Accessing a remote GPFS file system (mmauth / mmremotecluster / mmremotefs):
see the chapter “Accessing a remote GPFS file system” in the IBM Storage Scale 5.2.x Administration Guide

I hope this was helpful, even if a bit long-winded.

Best regards
Roberto