What's New?
With the most recent release of IBM Storage Ceph 7.0 in December 2023, a slew of new features arrived: NFS 4.1 support for CephFS, object lock certification for SEC and FINRA compliance, and dashboard enhancements that improve the UI/UX and simplify workflows and management. But one major improvement that flew under the radar, currently in Technical Preview, is the addition of a brand-new NVMe over TCP storage gateway. This addition unlocks the performance, durability and parallelism of NVMe by exposing disk access over a network transport, enabling high-performance, low-cost block storage without expensive SAN fabrics or the specialized skill sets that come along with them.
IBM Storage Ceph NVMe over TCP Gateway
NVMe over TCP is a network-transport-based storage protocol that unlocks the performance, density and parallelism of NVMe drives by utilizing the high-bandwidth, low-latency optical networks common in today's data centers. This provides a cost-effective, high-performing solution for the growing storage needs of your large data sets, containers and virtual machines.
Here's a high-level diagram of what we'll create today:
We will set up an NVMe subsystem, which defines the boundaries for our namespaces; configure an NVMe Gateway service and listener; and then attach NVMe namespaces to Ceph RBD images. In a follow-up post, we will connect a VMware ESXi 8.0 NVMe over TCP initiator to our Ceph NVMe Gateway and show how simple it is to create new VMFS datastores using this new functionality. NOTE: we will only set up a single subsystem in this example, as the Tech Preview currently supports only a single gateway on a single node.
Getting Started
To kick things off, if you don’t already have IBM Storage Ceph 7.0, we’ve got you covered: another change in our 7.0 release was a simplification of our trial license process (which can be found here). Here’s a brief, high-level overview of the prerequisites:
Ceph Easy
If you’d like a big, red “Easy Button” for this install, then navigate over to our very own Chris Blum’s video on how easy it can be to install IBM Ceph 7.0. Chris does a fantastic job of quickly and thoroughly walking us through a Ceph installation, from a few CLI strokes to finishing things off with our simplified “Expand Cluster” workflow in the UI. Highly recommended for newcomers and tinkerers alike!
Diving In
Alright, now that we’ve got a base Ceph cluster installed, let’s jump into setting up and configuring NVMe over TCP! First, you will want to bookmark this link. For the purposes of this guide, we will explain the NVMe/TCP setup in detail, though some may find the more robust installation guide better suited to their need for deeper information. We will be starting from a freshly installed three-node IBM Storage Ceph cluster, with all nodes joined to the cluster and all OSDs up and in.
NOTE: any code snippet written as $variable can be treated as a user-defined variable, which you can substitute with your own value as needed.
NVMe Gateway Prep and Installation
First we are going to log in to our Ceph management node and create an OSD pool, which is simply a logical partition for storing objects in Ceph:
ceph osd pool create $nvme_datastore01
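If you'd like a quick sanity check (optional), listing the pools should show the one we just created (same pool name variable as above):
ceph osd pool ls detail | grep $nvme_datastore01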
Next we'll initialize that same pool for use as a RADOS Block Device (RBD). This tells Ceph that we want to use this pool as a block source:
rbd pool init $nvme_datastore01
Create an RBD block image on the RBD pool we just initialized (note: default size is in MB):
rbd create $nvme_image01 --size $disk_size --pool $nvme_datastore01
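Optionally, you can confirm the image was created and check its size with the standard rbd tooling (same image and pool variables as above):
rbd ls --pool $nvme_datastore01
rbd info $nvme_image01 --pool $nvme_datastore01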
Scale the NVMeoF service, adding NVMeoF to node(s) in the cluster (in our example we will use only one node, nvme-demo-2):
ceph orch apply nvmeof $nvme_datastore01 --placement="nvme-demo-2"
View the scaled NVMeoF service, and ensure an instance is running on the correct node:
ceph orch ls nvmeof
This may take a few moments, as the service is deployed to the appropriate node and started. Your result should show “Running: 1/1” before moving on to the next step:
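If you'd rather not re-run the command by hand, one simple option is to poll it with the standard watch utility until the service reports Running: 1/1:
watch -n 5 "ceph orch ls nvmeof"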
If you have followed along with Chris on our #CephEasy – Ceph Installation via UI video, you may have already created a /etc/mylogin.json file with repo credentials from this previous step.
Otherwise, we simply need to log in to the repository from the Ceph management node with the following one-liner:
podman login -u cp -p $entitlement_key_goes_here cp.icr.io
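With the registry login in place, you can optionally pre-pull the NVMeoF CLI container image that we will use repeatedly in the configuration steps below, so the later podman run commands don't pause to download it:
podman pull cp.icr.io/cp/ibm-ceph/nvmeof-cli-rhel9:latest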
Here we’ll capture the NVMe Qualified Name (NQN) of the system to which we applied the NVMeoF service (in this example, I log in to nvme-demo-2 and run the following):
cat /etc/nvme/hostnqn
Back on nvme-demo-1, my management node, I’ll put this NQN into a variable so I can reuse it:
nqn="nqn.2014-08.org.nvmexpress:uuid:b7340342-f923-173a-a550-4d9d60f2ecd6"
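The configuration commands below also reference an $ip_of_node variable. This should be the IP address of the node running the NVMeoF service (nvme-demo-2 in our example); the address shown here is purely hypothetical, so substitute your own:
ip_of_node="192.168.1.102"    # hypothetical IP of nvme-demo-2 - replace with your own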
NVMe Gateway Configuration
Now we will use NVMeoF CLI commands to create the NVMe subsystem, create an NVMe Gateway listener, attach the NQN to the gateway, and create NVMe namespaces to be consumed by our NVMe initiators.
For the curious: an NVMe subsystem is simply a logical boundary for NVMe namespaces, and we will only be using a single subsystem in our example here. NQNs are "NVMe Qualified Names" - we can consider them unique identifiers per NVMeoF initiator or target. If you are familiar with iSCSI, you might recognize a similar naming scheme from IQNs, or "iSCSI Qualified Names"; they're almost identical in purpose.
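Purely for illustration, here is roughly what the two identifier formats look like side by side (example values only):
# NVMe NQN, uuid-based, as found in /etc/nvme/hostnqn:
#   nqn.2014-08.org.nvmexpress:uuid:b7340342-f923-173a-a550-4d9d60f2ecd6
# iSCSI IQN, for comparison:
#   iqn.2001-04.com.example:storage.disk1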
First we create the NVMe subsystem:
podman run -it cp.icr.io/cp/ibm-ceph/nvmeof-cli-rhel9:latest --server-address $ip_of_node --server-port 5500 create_subsystem --subnqn $nqn --max-namespaces 256
This command may take some time depending upon your internet connection, but we are looking for “True” after all the work is done, like so:
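If you'd like to double-check the result, the same CLI container can list the subsystems the gateway knows about (this command name matches the Tech Preview CLI used throughout this post and may change in later releases):
podman run -it cp.icr.io/cp/ibm-ceph/nvmeof-cli-rhel9:latest --server-address $ip_of_node --server-port 5500 get_subsystems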
Find the daemon name for the NVMeoTCP gateway and copy the full name:
ceph orch ps | grep nvmeof
NOTE: we will prepend “client.” to this daemon name, as this is required by the NVMe subsystem to create a listener using this service.
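If you'd prefer not to copy the daemon name by hand, a small sketch like the following captures it into a variable and prepends the required "client." prefix (it assumes a single nvmeof daemon and the standard awk utility); you could then pass $gw_name to the -g flag below:
gw_name="client.$(ceph orch ps --daemon-type nvmeof | awk 'NR>1 {print $1; exit}')"
echo $gw_name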
podman run -it cp.icr.io/cp/ibm-ceph/nvmeof-cli-rhel9:latest --server-address $ip_of_node --server-port 5500 create_listener -n $nqn -g client.nvmeof.nvme_datastore01.nvme-demo-2.fidpef -a $ip_of_node -s 4420
We are now able to discover the NVMe over TCP Gateway as a target, but by default there are no approved initiators. We will need to add approved initiators to ensure the correct transport connections are made (comma-separated):
podman run -it cp.icr.io/cp/ibm-ceph/nvmeof-cli-rhel9:latest --server-address $ip_of_node --server-port 5500 add_host --subnqn $nqn --host "nqn.2014-08.com.vmware:nvme:nvme-esxi"
***Alternatively, we can configure this NVMe target to allow all initiators, if we trust our local subnet:***
podman run -it cp.icr.io/cp/ibm-ceph/nvmeof-cli-rhel9:latest --server-address $ip_of_node --server-port 5500 add_host --subnqn $nqn --host "*"
We have now assigned allowed hosts to our NVMe over TCP target. Next, we need to associate an NVMe block device (create_bdev) with the RBD image we created earlier:
podman run -it cp.icr.io/cp/ibm-ceph/nvmeof-cli-rhel9:latest --server-address $ip_of_node --server-port 5500 create_bdev --pool $nvme_datastore01 --image $nvme_image01 --bdev $ceph_nvme_block01
Finally we will create the namespace associated with this block device:
podman run -it cp.icr.io/cp/ibm-ceph/nvmeof-cli-rhel9:latest --server-address $ip_of_node --server-port 5500 add_namespace --subnqn $nqn --bdev $ceph_nvme_block01
Now you have an NVMe Namespace that is accessible by the host NQNs specified in the previous steps, and the NVMe Gateway should be discoverable on the network over port 4420.
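Before we get to ESXi in the follow-up post, a quick way to confirm the gateway is reachable is to run a discovery from any Linux host with the nvme-cli package installed (assuming that host's NQN was added above, or that you allowed all initiators):
nvme discover -t tcp -a $ip_of_node -s 4420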
Congratulations!
You have now successfully created an NVMe Gateway on a Ceph cluster node, configured the NVMe subsystem, associated an RBD image with an NVMe block device, and exposed that block device via an NVMe namespace to your network. Remember: this is currently Tech Preview code; it only gets better from here.
In the next blog in the series, we will show you how to configure various NVMe over TCP initiators to consume these namespaces, and how seamlessly this protocol can access high-performance drives over the network.
Next steps and resources
IBM Storage Ceph
https://www.ibm.com/products/ceph
IBM Storage Ceph documentation
https://www.ibm.com/docs/en/storage-ceph
IBM Storage Ceph video demos
http://easy.ceph.blue
Shift Happens Ceph Cast videos link
https://ibm.biz/BdMKif
Why IBM?
Data matters. When planning high-performance infrastructure for new or existing applications, it’s easy to focus on compute resources and applications without properly planning for the data that will drive results for those applications. Our products are all about solving hard problems faster with data.
IBM helps customers achieve business value with a clear data strategy. Our strategy is simple: unlock data to speed innovation, de-risk data to bring business resilience, and help customers adopt green data to bring cost and energy efficiencies. Value is delivered by connecting an organization's multiple data sources with the business drivers that matter to that organization. Many organizations focus on a single driver with a storage solution, but the best solution is driven by an infrastructure strategy that can address most, if not all, of those drivers for maximum benefit. Our story is not just about another storage product; it is about innovation and a storage portfolio powered by our global data platform.
Contact IBM
https://www.ibm.com/contact#Contact+information