File and Object Storage


S3 tiering to tape with NooBaa Part 2 - Deploying NooBaa on IBM Storage Scale

By Nils Haustein posted Fri February 23, 2024 06:00 AM

  

As explained in part 1 of this series, the solution architecture comprises the open-source software NooBaa, which is installed on one or more IBM Storage Scale cluster nodes. NooBaa provides the S3 object storage endpoints to S3 users and applications. S3 objects and buckets are stored on disk storage managed by IBM Storage Scale file systems.

The NooBaa configuration files and object data are stored in distinct shared directories of IBM Storage Scale file systems that are accessible by all NooBaa nodes. The NooBaa configuration files include account and bucket configuration as well as NooBaa service customization. The NooBaa configuration directory can be in the same file system where the buckets and objects are stored or in a different file system. It is recommended to use a different file system for the NooBaa configuration files.

In this blog article we demonstrate how to install and configure NooBaa as a system service and how to use the S3 endpoints provided by NooBaa with the AWS CLI.

To deploy NooBaa on IBM Storage Scale, log on to one or more cluster nodes and execute the following steps.

Install NooBaa-core standalone

Installing NooBaa on an IBM Storage Scale cluster node requires the NooBaa rpm-package. The NooBaa rpm-package for Red Hat Enterprise Linux can either be built [3] or downloaded from the nightly build server [4].

The example below shows how to pull a nightly build package from January 9th, 2024 for RHEL 8. If you run RHEL 9, you must pull the package for that release (e.g., noobaa-core-5.15.0-20240109.el9.x86_64.rpm).

# curl -O https://noobaa-core-rpms.s3.amazonaws.com/noobaa-core-5.15.0-20240109.el8.x86_64.rpm
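
Optionally, you can inspect the downloaded package before installing it, for example with rpm:

# rpm -qpi noobaa-core-5.15.0-20240109.el8.x86_64.rpm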

After downloading or building, install the NooBaa package:

# yum install noobaa-core-5.15.0-20240109.el8.x86_64.rpm

Customize and start the NooBaa service

In our example, we install NooBaa on a single node. The NooBaa configuration files are stored in the shared file system /ibm/cesshared, and the buckets are stored in the shared file system /ibm/fs1. In the next step, the NooBaa service can be customized.
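
Optionally, verify that both shared file systems are mounted on every node where NooBaa will run, for example with the IBM Storage Scale mmlsmount command:

# mmlsmount all -L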

Create a directory for the NooBaa configuration files in the configuration file system. The directory must have read-write permissions for the NooBaa service user. The NooBaa service user in our example is root.

# mkdir -p /ibm/cesshared/noobaa

Redirect the default NooBaa configuration directory to our configuration directory:

# echo "/ibm/cesshared/noobaa" > /etc/noobaa.conf.d/config_dir_redirect

Create a configuration file for the NooBaa service in the shared configuration directory (/ibm/cesshared/noobaa/config.json). The parameter GPFS_DL_PATH is required and must point to the libgpfs.so library file. The parameter ALLOW_HTTP is optional and allows HTTP connections. The other configuration parameters shown below are examples; for more information, refer to [5].

# cat /ibm/cesshared/noobaa/config.json
{
  "ENDPOINT_FORKS": 0,
  "UV_THREADPOOL_SIZE": 64,
  "ALLOW_HTTP": true,
  "GPFS_DL_PATH": "/usr/lib64/libgpfs.so"
}
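
Optionally, verify that the configuration file is valid JSON, for example with the Python JSON tool (assuming python3 is installed):

# python3 -m json.tool /ibm/cesshared/noobaa/config.json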

The NooBaa package installation provides a systemd service unit named noobaa_nsfs.service. No further adjustments are needed for this service unit. To view the unit file, use the command:

# systemctl cat noobaa_nsfs.service
[Unit]
Description=The NooBaa nsfs service.
After=gpfs-wait-mount.service

[Service]
Restart=always
RestartSec=2
User=root
Group=root
ExecStartPre=/usr/local/noobaa-core/bin/node /usr/local/noobaa-core/src/upgrade/upgrade_manager.js --nsfs true --upgrade_scripts_dir /usr/local/noobaa-core/src/upgrade/nsfs_upgrade_scripts
ExecStart=/usr/local/noobaa-core/bin/node /usr/local/noobaa-core/src/cmd/nsfs.js
EnvironmentFile=-/etc/sysconfig/noobaa_nsfs
ExecStop=/bin/kill $MAINPID
WorkingDirectory=/usr/local/noobaa-core/
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

To activate the noobaa_nsfs service unit, use this command:

# systemctl start noobaa_nsfs
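
Optionally, enable the service so that it starts automatically after a reboot:

# systemctl enable noobaa_nsfs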

Check whether the noobaa_nsfs service started successfully, and check the logs if required:

# systemctl status noobaa_nsfs

# journalctl [-f] -u noobaa_nsfs
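
If the service is running, the S3 endpoints should be listening on the default ports 6001 (HTTP) and 6443 (HTTPS), unless you changed them. A quick check, assuming the ss utility is available:

# ss -tlnp | grep -E ':6001|:6443'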

If there are errors during startup, investigate them. Sometimes there are problems loading libgpfs.so, especially if the NooBaa and GPFS versions are not compatible. In this case, remove the parameter GPFS_DL_PATH from the configuration file (config.json) and restart the service.
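
To compare the installed versions, you can query the RPM database; the package name gpfs.base used here is the typical IBM Storage Scale base package:

# rpm -q gpfs.base noobaa-core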

Once the NooBaa service is started, accounts and, optionally, buckets can be created, as shown in the next section.

Create accounts and buckets

In the next step accounts and buckets can be created. An account is an S3 user with an access key and secret key to access the NooBaa S3 endpoint and buckets. Objects are stored in buckets. Buckets are directories and objects are files in the underlying IBM Storage Scale file systems.

Accounts and buckets can be created with a supplemental tool called manage_nsfs.js. To make this tool more convenient to use, create an alias named manage_nsfs:

# alias manage_nsfs='/usr/local/noobaa-core/bin/node /usr/local/noobaa-core/src/cmd/manage_nsfs.js'

This alias can also be stored in ~/.bashrc to make it available for later sessions.

In this example, we create an account named user1. This S3 user account manages objects in buckets that are stored in the file system directory /ibm/fs1/buckets. Buckets and files are stored with root user permissions (UID=GID=0). A different UID and GID can be used, but this requires the bucket path permissions to be set accordingly. To keep it simple, we just use UID=GID=0.

# manage_nsfs account add --name user1 --uid 0 --gid 0 --new_buckets_path /ibm/fs1/buckets/

Upon successful completion of this command, the access_key and secret_key are provided with the command output formatted in JSON:

"access_keys": [
  {
     "access_key": "UG2XmYf5jasJrhW2iOWy",
     "secret_key": "LjAESgsgMOkhozYHikq4TDgRzCHvmFyc4AKbYq/F"
  }
]

To list the accounts the following command can be used:

# manage_nsfs account list

To list more details about a specific account, use this command:

# manage_nsfs account list --name user1 --show_secrets --wide

Buckets can be created using the manage_nsfs tool or via the S3 endpoint. Find below an example to create a bucket named test1 in directory /ibm/fs1/buckets/test1/ with owner set to user1:

# mkdir -p /ibm/fs1/buckets/test1/
# manage_nsfs bucket add --name test1 --path /ibm/fs1/buckets/test1/ --owner user1

To create buckets for a user and group ID other than 0, the permissions of the bucket path (parameter --path) must be adjusted to allow that UID and GID access to the path, as shown in the example below.
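
As a hypothetical example, assume an account user2 was created with --uid 1000 --gid 1000 (the account name, UID, GID, and bucket name below are only illustrative). The bucket path can then be prepared and the bucket created like this:

# mkdir -p /ibm/fs1/buckets/test3/
# chown 1000:1000 /ibm/fs1/buckets/test3/
# chmod 770 /ibm/fs1/buckets/test3/
# manage_nsfs bucket add --name test3 --path /ibm/fs1/buckets/test3/ --owner user2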

To list the buckets the following command can be used:

# manage_nsfs bucket list

To list more information about a specific bucket, the following command can be used:

# manage_nsfs bucket list --name test1 --wide

Now let’s move on and access the S3 endpoints provided by NooBaa.

Use AWS CLI to access NooBaa S3 endpoints

The AWS CLI can be used on any server that has an IP connection to the IBM Storage Scale cluster nodes where NooBaa is deployed. The communication between the S3 client and the NooBaa storage services is done via HTTP port 6001 or HTTPS port 6443. These default ports can be adjusted, either in the config.json or when you start NooBaa (see section Customize and start the NooBaa service).
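
Optionally, you can check from the S3 client that the endpoint is reachable, for example with curl. Replace <noobaa-node> with the host name or IP address of a NooBaa node; an XML error response such as AccessDenied is expected for this unauthenticated request, and the exact response may vary:

# curl http://<noobaa-node>:6001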

In this example we use the AWS CLI on the same node where the NooBaa service is deployed. It is recommended to install the AWS CLI as a non-root user:

# pip3 install awscli --user
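
Note that a --user installation places the aws command in ~/.local/bin, so make sure this directory is in your PATH. You can verify the installation with:

# aws --version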

We created an account user1 with access key and secret key in NooBaa. Now we configure this account with the AWS CLI. For this we need the access key and secret key that we created in section Create accounts and buckets. Run the following command and enter the access key and secret key when prompted:

# aws configure --profile user1
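
Alternatively, the profile can be configured non-interactively with aws configure set, shown here with the example access key and secret key from the section Create accounts and buckets:

# aws configure set aws_access_key_id UG2XmYf5jasJrhW2iOWy --profile user1
# aws configure set aws_secret_access_key LjAESgsgMOkhozYHikq4TDgRzCHvmFyc4AKbYq/F --profile user1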

We are ready to access the S3 object storage service provided by NooBaa. Using HTTPS requires creating and configuring SSL certificates. To keep it simple, we are using HTTP. To list the buckets in our endpoint, run the following command:

# aws --profile user1 --endpoint http://localhost:6001 s3 ls

To simplify the command, I created an alias named s3u1 for user1, the endpoint URL and the s3 command prefix:

# alias s3u1='aws --profile user1 --endpoint http://localhost:6001 s3'

Store this alias in ~/.bashrc to re-use it later on.

With this alias the command lines become shorter. To list the content of the bucket test1, I simply use the alias s3u1:

# s3u1 ls s3://test1

Create another bucket named test2 and list all buckets:

# s3u1 mb s3://test2

# s3u1 ls s3://

PUT a file into bucket test1:

# s3u1 cp filename s3://test1

# s3u1 ls s3://test1

PUT a bunch of files stored in /tmp/files into the bucket test2:

# s3u1 cp --recursive /tmp/files/ s3://test2

# s3u1 ls s3://test2

GET a file from bucket test2 and store it locally:

# s3u1 cp s3://test2/filename ./

There are many more AWS CLI S3 commands available. To get help use:

# aws s3 help

And explore the s3api option with the AWS CLI:

# aws s3api help
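
For example, to show the metadata of the object named filename that we uploaded to bucket test1 earlier, you can use the head-object command with the same profile and endpoint options as before:

# aws --profile user1 --endpoint http://localhost:6001 s3api head-object --bucket test1 --key filename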

In the next article of this series (part 3), we explore how to tier buckets and objects, which are stored as directories and files in the file system, to tape. For the hands-on part, you need an IBM Storage Scale cluster with IBM Storage Archive installed.
