In part 3 of this blog article series, we demonstrated how easy it is to tier S3 objects to tape using the IBM Storage Scale policy engine and the IBM Storage Archive Enterprise Edition command line. In this article we show how S3 objects can be selectively migrated and recalled to and from tape based on metadata and tags applied by the S3 user.
Recap the environment
Let’s briefly recap the environment we are working with. As summarized in the picture below, we have deployed NooBaa on a single-node Storage Scale and Storage Archive cluster:
One S3 account user1 was created. The user has two buckets, test1 and test2. The buckets are stored in file system path /ibm/fs1/buckets. The NooBaa configuration is stored in file system path /ibm/cesShared/noobaa.
The two buckets (test1 and test2) are sub-directories in the file system path /ibm/fs1/buckets. Here is the current file and directory structure of the file system hosting the buckets:
# tree /ibm/fs1/buckets/
/ibm/fs1/buckets/
├── test1
│   ├── file0
│   ├── file1
│   ├── file2
│   ├── file3
│   └── file4
└── test2
    ├── file10
    ├── file6
    ├── file7
    ├── file8
    └── file9
All files are in resident state.
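This can be verified with the eeadm command of IBM Storage Archive; the grep simply condenses the output to name and state (same invocation style as used later in this article):

```shell
# show name and migration state of all files backing the two buckets;
# at this point every file should report "State: resident"
eeadm file state /ibm/fs1/buckets/test1/* /ibm/fs1/buckets/test2/* | grep -E "Name:|State:"
```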
Now let’s look at tagging objects and using tags to control migration and recall.
Tagging objects
The AWS S3 API allows adding metadata and tags to objects. Tags and metadata are key-value pairs. In combination with the IBM Storage Scale file system, tags and metadata are stored as extended attributes of the file in the file system. These extended attributes can be used with the policy engine to control migration and recall.
There are two ways to apply tags or metadata to objects using the AWS S3 API:
- With the put-object operation additional metadata can be applied to the object during PUT.
- With the put-object-tagging operation tags can be applied to objects after PUT.
In our examples below we define a tag key action with the value migrate or recall. Tagging and adding metadata can be done using the AWS CLI s3api. To simplify the AWS CLI s3api command, I created an alias named s3u1api that encapsulates the keys for user1, the endpoint URL, and the aws s3api command:
# alias s3u1api='AWS_ACCESS_KEY_ID=UG2XmYf5jasJrhW2iOWy AWS_SECRET_ACCESS_KEY=LjAESgsgMOkhozYHikq4TDgRzCHvmFyc4AKbYq/F AWS_ENDPOINT_URL=http://localhost:6001 aws s3api'
Store this alias in ~/.bashrc to re-use it later on.
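For example, the alias can be appended to ~/.bashrc like this (the access keys are the sample keys shown above):

```shell
# persist the alias so it is available in new shell sessions
cat >> ~/.bashrc <<'EOF'
alias s3u1api='AWS_ACCESS_KEY_ID=UG2XmYf5jasJrhW2iOWy AWS_SECRET_ACCESS_KEY=LjAESgsgMOkhozYHikq4TDgRzCHvmFyc4AKbYq/F AWS_ENDPOINT_URL=http://localhost:6001 aws s3api'
EOF
# load it into the current shell
source ~/.bashrc
```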
Let’s look at adding metadata during the put-object operation. In this example the S3 user adds the metadata tag action=migrate to object file0 in bucket test1:
# s3u1api put-object --bucket test1 --metadata action=migrate --body file0 --key file0
Note that the put-object operation copies the object into the bucket. The object is stored as file /ibm/fs1/buckets/test1/file0.
The S3 user can read the metadata tags of an object using the head-object operation:
# s3u1api head-object --bucket test1 --key file0
{
"AcceptRanges": "bytes",
"LastModified": "Thu, 29 Feb 2024 11:22:49 GMT",
"ContentLength": 8688640,
"ETag": "\"mtime-czhivmmqljb4-ino-3do\"",
"ContentType": "application/octet-stream",
"Metadata": {
"action": "migrate",
"storage_class": "STANDARD"
}
}
As shown above, the tag action=migrate was set for the object.
The metadata tag is stored as an extended attribute of the file, as shown below using the mmlsattr command of the Storage Scale CLI:
# mmlsattr -L -d /ibm/fs1/buckets/test1/file0
file name: /ibm/fs1/buckets/test1/file0
metadata replication: 1 max 2
data replication: 1 max 2
immutable: no
appendOnly: no
flags:
storage pool name: system
fileset name: buckets
snapshot name:
creation time: Thu Feb 29 12:22:49 2024
Misc attributes: ARCHIVE
Encrypted: no
user.action: "migrate"
user.noobaa.content_type: "application/octet-stream"
user.storage_class: "STANDARD"
As shown above, there is an extended attribute user.action=migrate that corresponds to the tag applied to the object.
Another way to add tags to an object is the put-object-tagging operation of the s3api. This operation works on existing objects in the bucket; it does not copy the object into the bucket. In the example below the S3 user adds the tag action=migrate to file1 in bucket test1:
# s3u1api put-object-tagging --bucket test1 --key file1 --tagging TagSet='[{Key=action,Value=migrate}]'
As shown above, the tag action=migrate is encoded in JSON with the TagSet parameter.
To read the tag the get-object-tagging operation can be used:
# s3u1api get-object-tagging --bucket test1 --key file1
{
"TagSet": [
{
"Key": "action",
"Value": "migrate"
}
]
}
The put-object-tagging operation does not open the file in the file system; it just adds an extended attribute corresponding to the tag to the file. The object file1 in bucket test1 is stored as file /ibm/fs1/buckets/test1/file1 and has the attribute user.noobaa.tag.action=migrate, as shown below:
# mmlsattr -L -d /ibm/fs1/buckets/test1/file1
file name: /ibm/fs1/buckets/test1/file1
metadata replication: 1 max 2
data replication: 1 max 2
immutable: no
appendOnly: no
flags:
storage pool name: system
fileset name: buckets
snapshot name:
creation time: Thu Feb 29 12:13:36 2024
Misc attributes: ARCHIVE
Encrypted: no
user.noobaa.content_type: "application/octet-stream"
user.storage_class: "STANDARD"
user.noobaa.tag.action: "migrate"
Note that the TagSet parameter can also include multiple tags, each characterized by key and value. The following example defines two tags: action=migrate and project=noobaaS3:
TagSet='[{Key=action,Value=migrate},{Key=project,Value=noobaaS3}]'
Each put-object-tagging operation overwrites the existing tags and the associated extended attributes of the file. To preserve old tags, include them in the TagSet.
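For example, to change the action tag while keeping the project tag from the example above, both tags must be supplied again:

```shell
# put-object-tagging replaces the complete TagSet, so re-supply every tag
# that should survive (project=noobaaS3 is the example tag from above)
s3u1api put-object-tagging --bucket test1 --key file1 \
  --tagging TagSet='[{Key=action,Value=recall},{Key=project,Value=noobaaS3}]'
```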
Depending on the operation used to apply tags or metadata, the namespace of the extended attributes storing the tags is different. With the put-object operation the namespace of the extended attributes is user.[tag-key]. With the put-object-tagging operation the namespace of the extended attributes is user.noobaa.tag.[tag-key]. These different namespaces must be considered when using the policy engine to select files with certain extended attributes.
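Both namespaces can also be inspected directly in the file system with the standard Linux getfattr command (paths from our environment):

```shell
# show all user.* extended attributes of both files:
# file0 was tagged via put-object (metadata)     -> user.action
# file1 was tagged via put-object-tagging (tags) -> user.noobaa.tag.action
getfattr --absolute-names -d -m 'user\.' \
  /ibm/fs1/buckets/test1/file0 /ibm/fs1/buckets/test1/file1
```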
Migrating tagged objects
In the example above we tagged two files (file0 and file1) in bucket test1 with the tag action=migrate. We can now use the policy engine to migrate these files to tape. The execution of the policy engine is decoupled from the S3 user who applies the tags. The policy engine is executed by an administrator or an automated program.
The policy to migrate files tagged with action=migrate looks like this:
/* RULE 1: define external pool */
RULE 'extpool' EXTERNAL POOL 'ltfs' EXEC '/opt/ibm/ltfsee/bin/eeadm'
OPTS '-p pool1@lib1'
/* RULE 2: Migration rule */
RULE 'mig' MIGRATE FROM POOL 'system' TO POOL 'ltfs' WHERE
(KB_ALLOCATED > 0) AND
( XATTR('user.noobaa.tag.action') like 'migrate' OR
XATTR('user.action') like 'migrate' )
The first rule defines the external pool. The second rule selects files to be migrated that have either the extended attribute user.action=migrate or user.noobaa.tag.action=migrate set.
The storage admin executes this policy stored in file mig-tagged.policy by using the following command:
# mmapplypolicy fs1 -P mig-tagged.policy
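Before running the migration, the rule selection can be verified with a dry run; the -I test option of mmapplypolicy evaluates the policy and reports the candidate files without invoking the external pool script:

```shell
# dry run: show which files the policy would select, without migrating anything
mmapplypolicy fs1 -P mig-tagged.policy -I test
```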
Subsequently, the storage admin can check the migration state of the files that were migrated:
# eeadm file state /ibm/fs1/buckets/test1/*
Name: /ibm/fs1/buckets/test1/file0
State: migrated
ID: 8642704550425025024-3560302634690375993-598196208-4380-0
Replicas: 1
Tape 1: DO0060L7@pool1@lib1 (tape state=appendable)
Name: /ibm/fs1/buckets/test1/file1
State: migrated
ID: 8642704550425025024-3560302634690375993-38891834-4377-0
Replicas: 1
Tape 1: DO0060L7@pool1@lib1 (tape state=appendable)
Name: /ibm/fs1/buckets/test1/file2
State: resident
As shown above, the two files (file0 and file1) that we tagged before are now migrated, while the untagged files remain resident.
Using object tagging the S3 user can apply metadata and tags that are used by the policy engine to migrate files. While the migration process is asynchronous to the tagging operation, migration can be done periodically, multiple times a day. Likewise, files can be tagged for recall, as demonstrated in the next section.
Recalling tagged objects
Like tagging objects for migration, we can tag objects for recall. We can use the same tag key action with the value recall. We must use the put-object-tagging operation of the S3 API, because the put-object operation would PUT another copy of the object into the bucket and delete the old copy. The new copy of the object would be in resident state, which makes a recall unnecessary.
Let’s recall the two objects (file0 and file1) in bucket test1 that were migrated before. First, the S3 user tags the objects with the tag action=recall:
# s3u1api put-object-tagging --bucket test1 --key file0 --tagging TagSet='[{Key=action,Value=recall}]'
# s3u1api put-object-tagging --bucket test1 --key file1 --tagging TagSet='[{Key=action,Value=recall}]'
The S3 user can display the tags using the following command:
# s3u1api get-object-tagging --bucket test1 --key file0
{
"TagSet": [
{
"Key": "action",
"Value": "recall"
}
]
}
# s3u1api get-object-tagging --bucket test1 --key file1
{
"TagSet": [
{
"Key": "action",
"Value": "recall"
}
]
}
As shown above, both files are tagged with action=recall.
The storage admin can now use a policy that recalls files that are tagged with action=recall and that are in migrated state. The policy looks like this:
/* MACRO: defining migrated state */
define(is_migrated, (MISC_ATTRIBUTES LIKE '%V%'))
/* RULE 1: define external pool */
RULE 'extpool' EXTERNAL POOL 'ltfs' EXEC '/opt/ibm/ltfsee/bin/eeadm'
/* RULE 2: Migration rule */
RULE 'rec' MIGRATE FROM POOL 'ltfs' TO POOL 'system' WHERE
is_migrated AND XATTR('user.noobaa.tag.action') like 'recall'
The first statement is a macro defining the migrated state of a file. The second statement “RULE 1” defines the external pool for recall. Note that we omitted the OPTS clause. The third statement “RULE 2” is the recall rule that recalls files from pool ltfs to pool system that are migrated and that have the extended attribute user.noobaa.tag.action set to recall.
To execute this policy stored in file rec-tagged.policy the storage admin executes the following command:
# mmapplypolicy fs1 -P rec-tagged.policy
The storage admin checks the state of the two files:
# eeadm file state /ibm/fs1/buckets/test1/*
Name: /ibm/fs1/buckets/test1/file0
State: premigrated
ID: 8642704550425025024-3560302634690375993-961002687-4364-0
Replicas: 1
Tape 1: DO0060L7@pool1@lib1 (tape state=appendable)
Name: /ibm/fs1/buckets/test1/file1
State: premigrated
ID: 8642704550425025024-3560302634690375993-38891834-4377-0
Replicas: 1
Tape 1: DO0060L7@pool1@lib1 (tape state=appendable)
Name: /ibm/fs1/buckets/test1/file2
State: resident
Both files (file0 and file1) are in pre-migrated state, which means that they are on disk.
Because the recall policy checks whether a file with the extended attribute user.noobaa.tag.action=recall is in migrated state, subsequent executions of the policy will not recall the file again, because the file is no longer in migrated state.
The recall and migration scenarios based on tags involve two different actors:
- The S3 user applies the tags to provide instructions for controlling migration and recall of objects.
- The storage admin defines and executes policies to migrate and recall files that implement the instructions provided by the S3 user through object tagging.
The S3 user updates the tags of objects as needed. The storage admin executes migrations or recalls based on the tags provided by the user. The tasks of the storage admin can be automated, as shown in the next section.
Automation of migration and recall
The tag-based migration and recall processes can be automated using a scheduler. Two processes must be scheduled, one which performs the migration and one which performs the recalls.
For the recall process, we define a time interval specifying the maximum time the S3 user must wait before an object tagged with action=recall is recalled. In our example we define a recall time interval of 1 hour. This means that the S3 user must wait a maximum of 1 hour before an object tagged with action=recall is recalled.
The migration process runs in a time interval of 4 hours in our example. This time interval can be shorter or longer depending on the requirements.
The crontab entries are created for the root user because the execution of the policy engine requires root privileges. The crontab entries look like this:
# crontab -e
# crontab entry to run migration and recall
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/lpp/mmfs/bin:/usr/local/bin
# run recall at the top of every hour
0 * * * * mmapplypolicy fs1 -P /pol-path/rec-tagged.policy >> /log-path/recall.log 2>&1
# run migrate every 4 hours at half past the hour
30 0,4,8,12,16,20 * * * mmapplypolicy fs1 -P /pol-path/mig-tagged.policy >> /log-path/migrate.log 2>&1
The recall process runs at the top of every hour and executes the policy file rec-tagged.policy that we created in section Recalling tagged objects. The migrate process runs every 4 hours at the 30th minute and executes the policy file mig-tagged.policy that we created in section Migrating tagged objects. Both processes write their output to separate log files.
Now the storage admin can lean back and let migration and recall happen automatically. The S3 user can tag objects to control migration and recalls. For example, the S3 user can tag all objects in bucket test1 for migration:
for i in $(seq 0 4);
do
s3u1api put-object-tagging \
--tagging TagSet='[{Key=action,Value=migrate}]' \
--bucket test1 --key file$i;
done
The migration according to the schedule above will happen within 4 hours.
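If the storage admin wants to know when the scheduled migration has actually happened, the file state can be polled, for example like this (a simple sketch; eeadm requires admin privileges):

```shell
# poll every 10 minutes until file0 reports "migrated"
until eeadm file state /ibm/fs1/buckets/test1/file0 | grep -q "State: migrated"; do
  sleep 600
done
echo "file0 is migrated"
```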
Likewise, the user can tag migrated objects in bucket test1 for recall:
for i in $(seq 0 4);
do
s3u1api put-object-tagging \
--tagging TagSet='[{Key=action,Value=recall}]' \
--bucket test1 --key file$i;
done
The automation of migration and recalls demonstrated above does not prevent the S3 user from GETting objects that are in migrated state. S3 GET operations on migrated objects cause transparent recalls. Simultaneous S3 GET operations on many (thousands of) migrated objects cause many transparent recalls, which can lead to recall storms. Each transparent recall copies one object from tape. This means each transparent recall mounts one tape to recall one file from tape. This is inefficient [7]. Therefore, it may be desirable to prevent transparent recalls and recall many objects in an optimized manner using bulk recalls. In the next section we explain how to prevent transparent recalls and use optimized or bulk recalls instead.
Enforcing optimized recalls
As explained above, transparent recalls triggered by GET object operations for many migrated objects can lead to recall storms. Recall storms endlessly mount and spool tapes to recall object by object. This is inefficient and time consuming. A more efficient way to recall many migrated objects is bulk recalls. With bulk (or optimized) recalls, many files or objects are recalled in the order of the tape ID and their position on tape. Bulk recalls sort the files by tape ID and location on tape and recall the files in order. Multiple tapes are processed in parallel. Bulk recalls are much faster than transparent recalls [7].
To enforce bulk recalls, transparent recalls must be disabled. IBM Storage Archive allows disabling transparent recalls using the cluster option allow_transparent_recall. In the example below we disable transparent recalls:
# eeadm cluster set -a allow_transparent_recall -v no
2024-03-01 20:38:29 GLESL802I: Updated attribute allow_transparent_recall.
# eeadm cluster show
Attribute Value
allow_migrate yes
allow_premigrate yes
allow_save yes
allow_selective_recall yes
allow_tape_assign yes
allow_tape_datamigrate yes
allow_tape_export yes
allow_tape_import yes
allow_tape_offline yes
allow_tape_online yes
allow_tape_reclaim yes
allow_tape_reconcile yes
allow_tape_replace yes
allow_tape_unassign yes
allow_transparent_recall no
filehash_enable yes
filehash_verify_on_read yes
recall_use_rao auto
Let’s check the migration state of the objects in bucket test1:
# eeadm file state /ibm/fs1/buckets/test1/* | grep -E "Name:|State:"
Name: /ibm/fs1/buckets/test1/file0
State: migrated
Name: /ibm/fs1/buckets/test1/file1
State: migrated
Name: /ibm/fs1/buckets/test1/file2
State: migrated
Name: /ibm/fs1/buckets/test1/file3
State: migrated
Name: /ibm/fs1/buckets/test1/file4
State: migrated
The output of the command is shortened and shows that all files are migrated.
Now let’s get file0 from bucket test1, keeping in mind that transparent recalls are disabled:
# s3u1 cp s3://test1/file0 ./file0
download failed: s3://test1/file0 to ./file0 An error occurred (AccessDenied) when calling the GetObject operation: Access Denied
The copy operation from the bucket test1 failed because transparent recalls are disabled.
Let’s check the file’s action tag:
# s3u1api get-object-tagging --bucket test1 --key file0
{
"TagSet": [
{
"Key": "action",
"Value": "migrate"
}
]
}
The action tag is set to migrate. This means the file is most likely migrated.
Now let’s tag the file for recall:
# s3u1api put-object-tagging \
    --tagging TagSet='[{Key=action,Value=recall}]' --bucket test1 --key file0
# s3u1api get-object-tagging --bucket test1 --key file0
{
"TagSet": [
{
"Key": "action",
"Value": "recall"
}
]
}
Because of the automation of recalls explained in section Automation of migration and recall, the GET object operation for file0 in bucket test1 succeeds after a maximum of 1 hour:
# s3u1 cp s3://test1/file0 ./file0
download: s3://test1/file0 to ./file0
Objects tagged with action=recall must eventually be tagged with action=migrate again to assure that these objects are migrated back to tape. This is the responsibility of the S3 user.
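For example, once the recalled object is no longer needed on disk, the S3 user can re-tag it so the scheduled migration policy picks it up again:

```shell
# re-tag the recalled object for migration; the next scheduled
# run of mig-tagged.policy will move it back to tape
s3u1api put-object-tagging --bucket test1 --key file0 \
  --tagging TagSet='[{Key=action,Value=migrate}]'
```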
Enforcing optimized recalls is disruptive for the S3 users because they cannot simply GET objects that are migrated to tape. Instead, the S3 users must tag migrated objects to trigger the automated tape-optimized recall. On the other hand, enforcing optimized recalls means less trouble for the administrator because it avoids recall storms. Clearly communicated service levels defining the maximum time for recalling objects help to set expectations with the S3 users. Enforcing optimized recalls is faster when recalling many objects scattered across many tapes. Furthermore, optimized recalls are more resource efficient because the number of tape mounts is much smaller and many files are copied from one tape during one mount operation. Finally, optimized recalls improve the usability and efficiency of an S3 object storage with tape.
I’m currently looking into using AWS Glacier to control migration and recalls of S3 objects. Stay tuned, another blog article in this series may show up in a couple of weeks.