I have a customer with 20k CPU/GPU cores of compute, with servers connected directly to fast storage. IBM Storage Scale (GPFS) has been deployed and the customer is working with their HPC applications.
We now have a second phase of compute with its own GPFS nodes and storage. It follows the same design, where the GPFS nodes are directly connected to their own high-performance storage.
Once deployed, we need to bring both phases together. As we have two GPFS server/storage deployments, I am assuming we can expand the current GPFS cluster so that it acts as the logical network layer spanning both. With LSF managing workload scheduling etc., is this how it should look from a GPFS perspective?
Any thoughts and help would be appreciated. Thanks.