S3 tiering to tape with NooBaa Part 6 – Automating S3 Glacier operations
In part 5 of this series we explained the fundamentals of AWS S3 Glacier and demonstrated how it works with NooBaa in combination with IBM Storage Scale. We demonstrated manual migration and recall of S3 Glacier objects in IBM Storage Scale using IBM Storage Archive. In this article we show how migration and recall can be automated using Storage Scale policies [3].
The automation of migration, recalls and re-migration is based on Storage Scale policies and scripts. Examples for these policies and scripts can be found on GitHub at [1]: https://github.com/IBM/spectrum-scale-policy-scripts/tree/master/noobaaGlacier
Note: the IBM Storage Scale S3 object storage service does not yet support the AWS Glacier API. The methods presented in this blog article cannot be used with IBM Storage Scale S3 Service.
Recap the Glacier behavior
Before we dive deeper into policies, let’s recap the behavior of S3 Glacier with NooBaa storing objects as files in Storage Scale file systems. The picture below shows the flows and highlights what happens under the covers of NooBaa in Storage Scale:
Starting with the first gray-colored block: the S3 user PUTs an object with the parameter storageclass set to GLACIER. As a result, NooBaa adds the extended attribute user.storage_class=GLACIER to the file representing the object in the file system. The automatic migration migrates all files that are not already migrated and that have the extended attribute user.storage_class=GLACIER set (more details in section Automatic migration and re-migration). If the user tries to GET the object, the request fails because the file is not ready for retrieval. With a head-object request, the S3 user can see that there is no ongoing restore request. This indicates that the object cannot be accessed using the standard GET request.
In the second blue-colored block the S3 user issues the restore-object request for the object with 1 day expiration period. As a result NooBaa sets the extended attribute user.noobaa.restore.request=1 for the associated file. The head-object request now shows that there is an ongoing restore request that has not completed yet.
In the next, orange-colored block the automatic recall takes place. The automatic recall recalls files that have the extended attributes user.storage_class=GLACIER and user.noobaa.restore.request set, and that are in migrated state. After the recall has succeeded, the extended attribute user.noobaa.restore.expiry is set in accordance with the expiration period and the extended attribute user.noobaa.restore.request is deleted (see section Automatic recall).
Now the S3 user can GET the object as shown in the green-colored block. With a head-object request, the user can determine that there is no ongoing restore request. The response also shows the expiration time after which the object will be re-migrated. Within the expiration time the user can access the object using the standard GET request. When the expiration time has passed, the S3 user cannot GET the object anymore, regardless of the object's migration state.
The last, gray-colored block shows the re-migration of objects. The automatic re-migration process migrates all files that are not in migrated state and where the expiration time encoded in extended attribute user.noobaa.restore.expiry has passed. Optionally, the automatic re-migration process can remove the extended attribute user.noobaa.restore.expiry (see section Automatic migration and re-migration).
The processes boxed in a red rectangle in the picture above are explained in this blog article. Automatic migration, automatic recall and automatic re-migration processes are implemented with Storage Scale policy rules and executed with the Storage Scale policy engine in accordance with schedules [1].
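The head-object checks described in the flow above can be scripted on the S3 client side. The following is a minimal sketch, assuming that NooBaa reports the restore state in the same format as AWS S3 (the Restore field of the head-object response). The function name restore_state and the variables ENDPOINT, BUCKET and KEY are placeholders chosen for illustration.

```shell
# Classify the restore state of a GLACIER object from the "Restore" field
# of a head-object response. Assumed AWS-style values:
#   ongoing-request="true"
#   ongoing-request="false", expiry-date="Fri, 21 Dec 2012 00:00:00 GMT"
# An empty value (or "None" from the AWS CLI) means no restore was requested.
restore_state() {
    local restore="$1"
    case "$restore" in
        ""|None)                     echo "not-requested" ;;
        *'ongoing-request="true"'*)  echo "restore-in-progress" ;;
        *'ongoing-request="false"'*) echo "restored" ;;
        *)                           echo "unknown" ;;
    esac
}

# Possible usage with the AWS CLI (ENDPOINT, BUCKET, KEY are placeholders):
#   restore=$(aws s3api head-object --endpoint-url "$ENDPOINT" \
#       --bucket "$BUCKET" --key "$KEY" --query Restore --output text)
#   restore_state "$restore"
```

An object in state not-requested needs a restore-object request before it can be read, while an object in state restored can be read with a standard GET until the reported expiry date.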
Note: the policies presented in this article process S3 Glacier objects only. S3 objects that are not associated with the storage class GLACIER are not processed, even though NooBaa S3 allows storing GLACIER and non-GLACIER objects (storage class STANDARD) in the same bucket. NooBaa S3 can be configured to reject non-GLACIER objects with the configuration parameter:
DENY_UPLOAD_TO_STORAGE_CLASS_STANDARD = true
This configuration parameter can be added to the config.json file located in the NooBaa configuration directory (see Part 2). After adding this parameter, the NooBaa S3 service must be restarted.
Automatic migration and re-migration
Migration of new objects in the Glacier storage class and re-migration of objects that were recalled and where the expiration time has expired can be combined in one policy.
Here is an example of the migration and re-migration policy. More details can be found in [1]:
/* define macro */
define(is_migrated, (MISC_ATTRIBUTES LIKE '%V%'))
/* exclude rule */
RULE 'exclude' EXCLUDE WHERE
(PATH_NAME LIKE '%/.SpaceMan/%' OR
PATH_NAME LIKE '%/.ltfsee/%' OR
PATH_NAME LIKE '%/.mmSharedTmpDir/%' OR
PATH_NAME LIKE '%.mmbackupCfg/%' OR
PATH_NAME LIKE '%/.snapshots/%' OR
NAME LIKE '.mmbackupShadow%' OR
NAME LIKE 'mmbackup%')
/* Migrate policy for testing */
RULE 'extPool' EXTERNAL POOL 'ltfs' EXEC '/opt/ibm/ltfsee/bin/eeadm'
OPTS '-p pool1@lib1' SIZE 20971520
RULE 'migGlacier' MIGRATE FROM POOL 'system' TO POOL 'ltfs' WHERE
FILE_SIZE > 0 AND
xattr('user.storage_class') = 'GLACIER' AND
NOT (is_migrated) AND
(( xattr('user.noobaa.restore.request') IS NULL AND
xattr('user.noobaa.restore.expiry') IS NULL )
OR
CURRENT_TIMESTAMP >= TIMESTAMP(CONCAT(CONCAT(SUBSTR(xattr('user.noobaa.restore.expiry'), 0, 10), ' '), SUBSTR(xattr('user.noobaa.restore.expiry'), 12, 8))) )
The first clause defines a macro specifying the migrated state. The second rule excludes certain files and directories from migration. The third rule defines the external pool ltfs that is managed by Storage Archive and migrates to pool pool1@lib1. The last rule is the migration rule that selects files that are eligible for migration.
The migration rule selects files that match the following conditions:
- Extended attribute user.storage_class set to GLACIER,
- File size is greater than 0 bytes (0 byte files cannot be migrated by Storage Archive and have to be saved using the eeadm save command),
- File is not in migrated state,
- Extended attributes user.noobaa.restore.request and user.noobaa.restore.expiry are not set OR time encoded in extended attribute user.noobaa.restore.expiry is expired relative to the current time stamp.
The last condition ensures that files are re-migrated once the expiration time has passed.
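The expiry comparison in the migration rule can be reproduced outside the policy engine, for example to check a single attribute value by hand. The sketch below assumes that user.noobaa.restore.expiry holds an ISO 8601 value such as 2024-05-01T12:30:00.000Z (a format inferred from the SUBSTR offsets in the rule) and that the value is in UTC. The function names are hypothetical, and GNU date is required.

```shell
# Convert an expiry attribute value into the 'YYYY-MM-DD HH:MM:SS' form
# that the rule builds with CONCAT and SUBSTR before calling TIMESTAMP().
to_policy_timestamp() {
    local expiry="$1"
    # date part (10 characters), a blank, then the time part (8 characters)
    echo "${expiry:0:10} ${expiry:11:8}"
}

# Return 0 (true) if the expiry time has been reached, mirroring the
# CURRENT_TIMESTAMP >= TIMESTAMP(...) comparison of the rule.
# The optional second argument is "now" in epoch seconds (for testing).
is_expired() {
    local expiry_epoch now
    expiry_epoch=$(date -u -d "$(to_policy_timestamp "$1")" +%s) || return 2
    now="${2:-$(date -u +%s)}"
    [ "$now" -ge "$expiry_epoch" ]
}

# Example:
#   is_expired "2024-05-01T12:30:00.000Z" && echo "eligible for re-migration"
```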
To execute this policy, store the rules in a file (migrate.pol) and execute the rules using the policy engine with the command:
# mmapplypolicy [path-or-device] -P migrate.pol -m [drives-1]
-N [archiveNodes] -B [bucket-size] --single-instance
The parameters of the mmapplypolicy command are:
Path-or-device: File system path where the objects are stored.
-P: name of the file containing the migrate policy, e.g. migrate.pol.
-m: number of parallel threads per node. The maximum number of threads should be equivalent to the number of drives per node. To allow additional recalls, set the parameter to number of drives per node minus 1.
-N: node names or node class executing the policy. Include all nodes that run Storage Archive.
-B: bucket-size per migrate thread. Depends on the size of the files and how many files are selected by the policy on average. Choose a number between 1000 and 20000 files.
--single-instance: ensures that only one instance of this policy is executed at a time.
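The parameter guidance above can be captured in a small helper that derives the thread count from the number of drives per node. This is only a sketch: the function name build_policy_cmd and the example values (/ibm/fs1/buckets, archiveNodes, 4 drives, bucket size 1000) are placeholders, and the function prints the command instead of running it.

```shell
# Build the mmapplypolicy command line for a tape-bound policy run.
# Arguments: file system path, policy file, drives per node,
#            node names or node class, bucket size.
build_policy_cmd() {
    local fs_path="$1" policy="$2" drives="$3" nodes="$4" bucket="$5"
    # keep one drive per node free for the opposite operation,
    # but never go below one thread
    local threads=$(( drives > 1 ? drives - 1 : 1 ))
    echo "mmapplypolicy $fs_path -P $policy -m $threads -N $nodes -B $bucket --single-instance"
}

# Example (prints the command line for review):
#   build_policy_cmd /ibm/fs1/buckets migrate.pol 4 archiveNodes 1000
```

Printing the command first makes it easy to review the values before the command is placed into a scheduled job.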
The MIGRATE rule shown above migrates new objects in storage class GLACIER without delay. If you want to migrate only GLACIER objects that were not accessed for more than 5 days, the MIGRATE rule can be adjusted as shown below:
/* define macro for access_age */
define( access_age,(DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) )
RULE 'migGlacier' MIGRATE FROM POOL 'system' TO POOL 'ltfs' WHERE
FILE_SIZE > 0 AND
xattr('user.storage_class') = 'GLACIER' AND
NOT (is_migrated) AND
(( xattr('user.noobaa.restore.request') IS NULL AND
xattr('user.noobaa.restore.expiry') IS NULL AND
access_age > 5 )
OR
CURRENT_TIMESTAMP >= TIMESTAMP(CONCAT(CONCAT(SUBSTR(xattr('user.noobaa.restore.expiry'), 0, 10), ' '), SUBSTR(xattr('user.noobaa.restore.expiry'), 12, 8))) )
The first clause defines a macro for the access age that calculates the number of days since the last access. The second rule is the migrate rule with an added condition that selects new GLACIER objects only if they were not accessed for more than 5 days.
The migration and re-migration policy may run once or twice a day. To automate the policy run, the mmapplypolicy command above can be scheduled using cron. For a more robust automation of the policy run, explore the Storage Scale automation framework [2].
Automatic recall
Objects must be recalled after the S3 user issued a restore-object request. NooBaa signals this with the extended attribute user.noobaa.restore.request in the file associated with the object. The automatic recall includes two steps:
- Recall migrated files that have the extended attribute user.noobaa.restore.request set, see section Recall objects.
- Adjust extended attributes user.noobaa.restore.request and user.noobaa.restore.expiry for the files that were successfully recalled, see section Adjust attributes.
Both steps run sequentially one after the other.
Recall objects
Tape optimized recalls are implemented with an EXTERNAL LIST policy. The EXTERNAL LIST policy accommodating the recall consists of two rules as shown below:
RULE 'extlist' EXTERNAL LIST 'recall' EXEC '/path-to/recallGlacier.sh'
RULE 'listRec' LIST 'recall' FOR FILESET('buckets') WHERE
xattr('user.storage_class') = 'GLACIER' AND
is_migrated AND
xattr('user.noobaa.restore.request') IS NOT NULL
The first rule defines an external program that is invoked with the file names selected by the second rule. In this example the external program is recallGlacier.sh. An example of this script can be found in [1].
The second rule defines the selection criteria for files to be recalled. It selects files that are migrated, where the extended attribute user.storage_class is set to GLACIER and the extended attribute user.noobaa.restore.request is set to a value denoting the expiration period.
The files selected by the second rule are passed to the external program (recallGlacier.sh) in a file list. The external program recallGlacier.sh recalls the files in this list in a tape-optimized fashion by passing the file list to the Storage Archive recall command (eeadm recall).
The file list (filelist) is provided by the policy engine to the script and contains the path and file name of the selected files matching the second rule above.
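To illustrate how an external program can process the file list, here is a minimal sketch of such a script. It is not the recallGlacier.sh script from [1]. It assumes the usual calling convention of the policy engine (operation as first argument, file list as second argument) and file list records of the form "inode generation snapid -- path". The variable RECALL_CMD is a hypothetical hook that defaults to the Storage Archive recall command. Storage Archive may accept the policy file list format directly, in which case the path extraction is not needed.

```shell
# Print only the path names of a policy file list. Everything up to
# the first " -- " separator of each record is dropped.
extract_paths() {
    awk '{ i = index($0, " -- "); if (i > 0) print substr($0, i + 4) }' "$1"
}

# Handle one invocation by the policy engine: <operation> <file-list>
handle_policy_call() {
    local op="$1" filelist="$2"
    case "$op" in
        TEST)   # the policy engine checks that the script is usable
            return 0 ;;
        LIST)
            local paths
            paths=$(mktemp) || return 1
            extract_paths "$filelist" > "$paths"
            # one bulk recall for the whole list, so that Storage Archive
            # can order the recalls by tape
            ${RECALL_CMD:-eeadm recall} "$paths"
            local rc=$?
            rm -f "$paths"
            return $rc ;;
        *)      # ignore other operations
            return 0 ;;
    esac
}

# A real script would end with:  handle_policy_call "$@"
```

Passing the complete list in one call, instead of recalling file by file, is what allows the recall to be tape optimized.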
The exact recall policy (recall.pol) can be found on [1]. To execute the EXTERNAL LIST policy, use the following command:
# mmapplypolicy [path-or-device] -P recall.pol -m [drives-1]
-N [archiveNodes] -B [bucket-size] --single-instance
The parameters of the mmapplypolicy command are:
Path-or-device: File system path where the objects are stored.
-P: name of the file containing the recall policy, e.g. recall.pol.
-m: number of parallel threads per node. The maximum number of threads should be equivalent to the number of drives per node. To allow additional migrates, set the parameter to number of drives per node minus 1.
-N: node names or node class executing the policy. Include all nodes that run Storage Archive.
-B: bucket-size per thread. Depends on the size of the files and how many files are selected by the policy on average. Choose a number between 1000 and 20000 files.
--single-instance: ensures that only one instance of this policy is executed at a time.
The recall policy should run in accordance with the agreed service levels that specify how long an S3 user must wait after requesting access to an object by issuing the restore-object request. For example, if the agreed service level is 2 hours, then this policy should run every 1.5 hours.
To automate the policy run, the mmapplypolicy command above can be scheduled using cron. For a more robust automation of the policy run, explore the Storage Scale automation framework [2].
Adjust attributes
After the recall has completed, the extended attribute user.noobaa.restore.expiry must be set in accordance with the restore request period, and the extended attribute user.noobaa.restore.request can be deleted. The restore request period was provided with the restore-object request initiated by the S3 user. The restore request period specifies the number of days the object should be available for GET after the recall. The expiration time encoded in extended attribute user.noobaa.restore.expiry is set to the current date plus the number of days of the restore request period.
Adjusting the extended attributes is accomplished with another EXTERNAL LIST policy that is shown below:
RULE 'extlist' EXTERNAL LIST 'setExpiry' EXEC '/path-to/setExpire.sh'
RULE 'listFiles' LIST 'setExpiry' FOR FILESET('buckets') WHERE
xattr('user.storage_class') = 'GLACIER' AND
NOT (is_migrated) AND
xattr('user.noobaa.restore.request') IS NOT NULL
The first rule defines an external program that is invoked with the file names selected by the second rule. In this example the external program is setExpire.sh. An example of this script can be found in [1].
The second rule defines the selection criteria for files. The second rule selects files that are not migrated, where the extended attribute user.storage_class is set to GLACIER and the extended attribute user.noobaa.restore.request is set to a value denoting the expiration period.
The external program setExpire.sh sets the attributes for the selected files. The selected files are provided as a file list by the policy engine. For each file in the file list, the setExpire.sh script performs the following steps:
- Calculates the expiry-date based on the current time plus the time period in days encoded in the attribute user.noobaa.restore.request
- Sets the attribute user.noobaa.restore.expiry to the value of the calculated expiry-date by using the command:
mmchattr --set-attr user.noobaa.restore.expiry=[expiry-date]
- Removes the attribute user.noobaa.restore.request by using the command:
mmchattr --delete-attr user.noobaa.restore.request
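The steps above can be sketched as follows. This is not the setExpire.sh script from [1]. The function names are hypothetical, the ISO 8601 output format in UTC is an assumption about what NooBaa expects in user.noobaa.restore.expiry, and the script in [1] may round or format the date differently. GNU date is required. The mmchattr commands are only printed so that they can be reviewed first.

```shell
# Compute the expiry date for a restore request of <days> days.
# The optional second argument is the start time in epoch seconds
# (defaults to the current time).
compute_expiry() {
    local days="$1" start="${2:-$(date -u +%s)}"
    date -u -d "@$(( start + days * 86400 ))" +%Y-%m-%dT%H:%M:%S.000Z
}

# Print the two mmchattr commands for one recalled file (dry run).
# Arguments: file path, restore period in days, optional start time.
print_attr_cmds() {
    local file="$1" days="$2" start="${3:-}"
    local expiry
    # $start is intentionally unquoted: an empty value passes no argument
    expiry=$(compute_expiry "$days" $start) || return 1
    echo "mmchattr --set-attr user.noobaa.restore.expiry=$expiry \"$file\""
    echo "mmchattr --delete-attr user.noobaa.restore.request \"$file\""
}

# Example:
#   print_attr_cmds /ibm/fs1/buckets/b1/obj1 1
```

The order of the two commands matters: the expiry attribute is set first, so that a file never ends up with neither attribute and is not selected for migration again before the S3 user had a chance to read it.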
The exact policy for adjusting the attributes after recall (setexpire.pol) can be found in [1]. To execute the EXTERNAL LIST policy, use the following command:
# mmapplypolicy [path-or-device] -P setexpire.pol -m [num] -N [nodes]
-B [bucket-size] --single-instance
The parameters of the mmapplypolicy command are:
Path-or-device: File system path where the objects are stored.
-P: name of the file containing the policy for adjusting the attributes, e.g. setexpire.pol.
-m: number of parallel threads per node. This is not limited to the number of drives per node. Start with a number of threads between 4 and 10.
-N: node names or node class executing the policy. This can be any node in the cluster that can participate in policy runs.
-B: bucket-size per thread. Depends on the size of the files and how many files are selected by the policy on average. Choose a number between 1000 and 20000 files.
--single-instance: ensures that only one instance of this policy is executed at a time.
The policy adjusting the extended attributes must run right after the recall policy has finished. It sets the attributes for all files that were successfully recalled.
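One way to keep both steps together is a small wrapper that runs them back to back. The sketch below is an illustration under assumptions, not part of the scripts in [1]: the function name run_recall_cycle is hypothetical and the two command lines are passed in as strings. The attribute run is started even if the recall run reported errors, because the attribute policy only selects files that are no longer in migrated state.

```shell
# Run the recall policy and then the attribute policy.
# Arguments: recall command line, attribute command line.
# Returns 1 if at least one of the two runs reported an error.
run_recall_cycle() {
    local recall_cmd="$1" setexpire_cmd="$2" rc=0
    bash -c "$recall_cmd"    || { echo "recall run reported errors" >&2; rc=1; }
    # always runs, so files recalled successfully get their expiry set
    bash -c "$setexpire_cmd" || { echo "attribute run reported errors" >&2; rc=1; }
    return $rc
}

# Example with placeholder values:
#   run_recall_cycle \
#     "mmapplypolicy /ibm/fs1/buckets -P recall.pol -m 3 -N archiveNodes -B 1000 --single-instance" \
#     "mmapplypolicy /ibm/fs1/buckets -P setexpire.pol -m 4 -N archiveNodes -B 1000 --single-instance"
```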
To automate the policy run, the mmapplypolicy command above can be scheduled using cron. For a more robust automation of the policy run, explore the Storage Scale automation framework [2].
Summary
The automation of migration, recall and re-migration for S3 GLACIER objects based on Storage Scale policies works seamlessly once the policies have been programmed and tested. The examples provided in this article are a good starting point. More information about Storage Scale policies can be found in [3].
For the S3 user, Glacier is transparent. The user stores objects in the NooBaa S3 object storage and the associated files are automatically migrated to tape. The user knows if an object is accessible or not by leveraging the head-object request. This request shows if a restore (or recall) is scheduled. If no recall is scheduled, then the user can schedule it using the restore-object request. The automated recall process will recall the file within the agreed time. Using the head-object request, the user can see when the object is ready for read and can then use the standard GET request to read the object from the bucket.
The automation of recalls allows for implementation of service level agreements that specify how long an S3 user must wait until migrated objects are available for access. The agreed waiting time should not be too short because each policy run takes some time. For large file systems with hundreds of millions of objects, a policy run may take minutes to hours. The duration of policy runs must be considered when defining service levels.
The implementation presented in this article is based on three different policies that must run sequentially. Recall policies (recall and adjust attributes) run more often than the migrate policy. In extremely large environments each policy run may take very long (several hours). For such environments the policy-based automation is not practical. NooBaa offers a log-based approach that allows minimizing the policy runs. This log-based approach is explained in one of the next articles, stay tuned.
References
[1] Repository with examples for Glacier policies:
https://github.com/IBM/spectrum-scale-policy-scripts/tree/master/noobaaGlacier
[2] Repository with example for automating policy runs:
https://github.com/IBM/SpectrumScaleAutomation
[3] IBM Storage Scale ILM and Archiving Policies - A practical Guide:
https://www.ibm.com/support/pages/node/6260749