In this blog article I describe some challenges and solutions when using storage tiering and backup functions in IBM Spectrum Scale file systems.
The backup function is based on the IBM Spectrum Scale mmbackup program that performs file level backup using the IBM Spectrum Protect client and server architecture in a scalable fashion. The IBM Spectrum Protect server can use tape storage pools to store the file copies on cost efficient storage media. The goal of backup is the ability to restore files in case of operational failures.
The storage tiering function migrates files from disk to tape. Storage tiering to tape can be accomplished with IBM Spectrum Archive Enterprise Edition or IBM Spectrum Protect for Space Management. IBM Spectrum Scale provides the policy engine, which automates user-defined migration rules. The goal of storage tiering is to save storage cost by storing files on the most appropriate storage medium. In particular, large volumes of big files that are seldom accessed and must be kept for a long period of time benefit from being stored on tape, because tapes provide large storage capacities and do not consume power at rest.
Best practices when combining storage tiering and backup
One challenge when using backup and storage tiering in the same IBM Spectrum Scale name space (file system or fileset) is when migrated files must be backed up. In this case, the backup operation requires the migrated files to be recalled. The file recall operation is “expensive” because it requires tapes to be mounted and spooled. The following best practices help to address this challenge:
- Schedule backup prior to migration to prevent migrated files from being recalled for backup.
- Schedule backup daily to ensure that new and modified files are backed up promptly, before they are migrated.
- Schedule migration 2 – 3 days after file creation or modification, using appropriate policy rules. This gives sufficient time for backup before migration.
- Use the mmbackup option --skip-migrated to prevent recall storms.
IBM Spectrum Protect for Space Management specific recommendations:
- Set the IBM Spectrum Protect management class parameter MIGREQUIRESBACKUP to yes. This prevents files from being migrated before they are backed up.
IBM Spectrum Archive specific recommendations:
- Use the option --mmbackup path-to-shadow-db in conjunction with the migration process. This option will check if a file selected for migration was backed up. If this is not the case then the file is skipped for migration.
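The scheduling rules above boil down to a simple eligibility test: a file should only be migrated once it has aged for a few days and a backup exists that is newer than its last change. The following Python sketch illustrates that logic only. It is not how the policy engine, MIGREQUIRESBACKUP, or the shadow database check are implemented. The FileState record, the last_backup field, and the use of the later of MTIME and CTIME as "last change" are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

SECONDS_PER_DAY = 86_400


@dataclass
class FileState:
    """Hypothetical per-file record; all times are epoch seconds."""
    path: str
    mtime: float                  # last data modification
    ctime: float                  # last metadata change (for example ACL)
    last_backup: Optional[float]  # time of last successful backup, None if never


def is_migration_candidate(f: FileState, now: float,
                           min_age_days: float = 3.0) -> bool:
    """Return True if the file is old enough and has a current backup."""
    # Assumption: "last change" is the later of MTIME and CTIME.
    last_change = max(f.mtime, f.ctime)
    old_enough = (now - last_change) >= min_age_days * SECONDS_PER_DAY
    # A backup is "current" if it was taken at or after the last change.
    backed_up = f.last_backup is not None and f.last_backup >= last_change
    return old_enough and backed_up
```

With a daily backup and a minimum age of 3 days, a file normally passes both checks on its first migration run, so the backup never has to recall it.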
When following these best practices there is a good chance that backup and storage tiering run smoothly in an IBM Spectrum Scale name space. However, there is one more challenge: Access Control Lists (ACL).
The challenge with Access Control List (ACL)
Access Control Lists (ACL) allow fine-grained access control to files for users and groups. Files and directories in an IBM Spectrum Scale name space have ACL when the file system parameter -k is set to nfs4 or all. This file system parameter is required when exporting directories to SMB or NFS v4 clients.
When files have ACL, then these ACL are backed up along with the file. In fact, the ACL are not stored in the IBM Spectrum Protect server database, but with the file in the IBM Spectrum Protect storage pool. When using tape storage to store the backup data, then the ACL are stored on tape, along with the file data.
On the other hand, ACL are not considered for storage tiering. When a file is migrated, then the file metadata including the ACL remains in the file system, just the file data is stored on tape.
ACL can be inherited. This means that if the ACL of a directory is changed by an authorized user or administrator, then the ACL of all files and subdirectories within this directory change as well. Thus, a simple ACL change on a higher-level directory can change the ACL for many files and directories.
When the ACL of a file is changed, the change time (CTIME) of the file is updated to the current time stamp. When the file data is modified, the modification time (MTIME) of the file is updated to the current time. The IBM Spectrum Scale backup program mmbackup identifies file candidates for backup based on the CTIME and MTIME. If only the CTIME has changed, this indicates that metadata such as the ACL has changed. If the MTIME has changed, this indicates that the file data has changed.
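The time stamp comparison can be sketched in a few lines of Python. This is a simplified illustration of the idea, not the actual mmbackup candidate selection. The function name and the notion of comparing against the time stamps recorded at the last backup are assumptions.

```python
def classify_change(mtime: float, ctime: float,
                    backed_up_mtime: float, backed_up_ctime: float) -> str:
    """Classify what changed since the last backup, based on time stamps.

    Returns "data", "metadata", or "unchanged".
    """
    if mtime != backed_up_mtime:
        # The file content was modified, so the data must be backed up.
        return "data"
    if ctime != backed_up_ctime:
        # Only CTIME moved: a metadata change such as an ACL update.
        return "metadata"
    return "unchanged"
```

The "metadata" case is the one that matters for the rest of this article: the file content is identical, yet the file is still a backup candidate.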
Let’s see what happens in terms of storage tiering and backup when ACL are changed for a file in a file system with ACL enabled:
- If the file is not migrated and the ACL is changed, then the file will be backed up during the next backup operation. If only the ACL has changed, but not the content of the file, then the backup client will just send the ACL to the IBM Spectrum Protect server and not the entire file. The ACL is stored along with the file on tape by the IBM Spectrum Protect server.
- If the file is migrated and the ACL is changed, then the file is not recalled. However, the next backup operation requires a backup of this file because the ACL has changed. If the mmbackup option --skip-migrated is used, then the file will not be recalled, and the backup process ends with a warning message. This warning message informs the administrator that files have been skipped from backup and provides a list of skipped files. If mmbackup is executed with the option --backup-migrated, then the files with changed ACL will be recalled during the backup operation.
When ACL are changed for many files that are migrated, then these migrated files must be recalled and backed up again to copy the new ACL to the IBM Spectrum Protect server. If ACL are changed frequently for many files, then the storage administrator has a lot of extra work to recall, back up, and migrate files.
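The cases described above can be summarized as a small decision function. The Python sketch below only models the behavior described in this article for a file whose content is unchanged and for a backup client with default ACL options. The Action names and the function signature are made up for illustration.

```python
from enum import Enum


class Action(Enum):
    NO_ACTION = "nothing to back up"
    SEND_ACL_ONLY = "send only the ACL to the server"
    SKIP_WITH_WARNING = "skip file, backup ends with a warning"
    RECALL_AND_BACKUP = "recall file from tape, then back it up"


def backup_action(migrated: bool, acl_changed: bool,
                  skip_migrated: bool) -> Action:
    """Model the next backup for a file whose data did not change.

    skip_migrated=True stands for mmbackup --skip-migrated,
    skip_migrated=False stands for mmbackup --backup-migrated.
    """
    if not acl_changed:
        return Action.NO_ACTION
    if not migrated:
        # Resident file: the client sends the ACL, not the whole file.
        return Action.SEND_ACL_ONLY
    # Migrated file with a changed ACL: the option decides.
    return Action.SKIP_WITH_WARNING if skip_migrated else Action.RECALL_AND_BACKUP
```

Neither outcome for a migrated file is attractive: one leaves the backup out of date, the other mounts tapes.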
Solving the ACL challenge
As demonstrated above, the challenge is the backup operation that requires migrated files to be recalled, when ACL have changed. Let me introduce three IBM Spectrum Protect backup client options that deal with backing up ACL:
SKIPACL: ACL will not be backed up at all. Use this option if you do not care about ACL on restore. On restore the file will have ACL set according to POSIX permissions.
SKIPACLUPDATECHECK: The initial ACL are backed up and all subsequent ACL updates are not backed up. On restore the file will have the initial ACL. This option is mutually exclusive with the client option SKIPACL.
UPDATECTIME: Updates the change time (CTIME) in the IBM Spectrum Protect server, if the CTIME has changed. One reason for the CTIME to change is when ACL are changed. Updating the CTIME in the IBM Spectrum Protect server is required to synchronize the file time stamps between the file system and the IBM Spectrum Protect server. The CTIME update in the IBM Spectrum Protect server does not require a recall if the file is migrated.
When the client option SKIPACL yes is set along with the option UPDATECTIME yes in the dsm.opt file, the backup operation does not back up ACL at all. ACL changes for migrated files do not cause recalls (or skipped files respectively) with the next backup operation. Instead, the new CTIME is updated in the IBM Spectrum Protect server without requiring a recall. The downside of excluding ACL from backup is that no ACL are restored during a restore operation. The POSIX permissions of the restored file correspond to the file’s permissions during the last backup operation. The file will have ACL, but these ACL will be very simple and correspond to the POSIX permissions.
When the client option SKIPACLUPDATECHECK yes is set along with the option UPDATECTIME yes in the dsm.opt file, the backup operation backs up the ACL during the first backup operation. If you followed the recommendation to run backup prior to migration, then this will not cause recalls. If the ACL of the files change after the first backup and the files are migrated, then the next backup operation will not cause a recall (or skipped files respectively) and it will not back up the new ACL. It will update the CTIME in the IBM Spectrum Protect server. During restore the initial ACL of the file will be restored.
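The two option combinations can be captured in a small helper that returns the lines to place in dsm.opt. This Python sketch is a convenience illustration only. The mode names "skip_all_acl" and "keep_initial_acl" are invented here, and the option lines follow the "OPTION yes" form used in this article.

```python
from typing import List

# Hypothetical mode names mapped to the option pairs described above.
_ACL_MODES = {
    "skip_all_acl": ["SKIPACL yes", "UPDATECTIME yes"],
    "keep_initial_acl": ["SKIPACLUPDATECHECK yes", "UPDATECTIME yes"],
}


def acl_client_options(mode: str) -> List[str]:
    """Return the dsm.opt lines for one of the two ACL handling modes.

    Because exactly one mode is chosen, SKIPACL and SKIPACLUPDATECHECK
    can never end up in the same option file.
    """
    if mode not in _ACL_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    return list(_ACL_MODES[mode])


if __name__ == "__main__":
    print("\n".join(acl_client_options("keep_initial_acl")))
```

Selecting a mode instead of toggling individual options keeps the mutually exclusive pair from being set together by accident.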
Attention: when using the client options to skip ACL and update CTIME, the restore operation will not restore the current ACL of the files. This can cause users to lose access to the files. The current ACL must be applied manually by an administrator, either via the SMB share or using IBM Spectrum Scale commands (for example mmeditacl or mmputacl). The main challenge here is figuring out what the current ACL for the files were.
You may run into the following situation: an existing file system with ACL enabled has been backed up and tiered to tape over a period of time, the client options to skip ACL and update CTIME have not been set, and ACL are changed for migrated files. In this case, the next backup operation ends with a warning message and skips the migrated files. This assumes that you run mmbackup with the recommended option --skip-migrated. Subsequent backup operations end with the same warning message and skip the same migrated files. To resolve this situation, perform the following steps:
- Recall the files that were skipped during the backup operation. The list of files that were skipped during the backup is presented with the warning message of the backup operation. Use the tape optimized recall for recalling many files.
- Prevent ACL changes.
- Perform another backup operation to back up the recalled files. This backup operation should not end with a warning message and should not recall files, because the files that were skipped have already been recalled.
- Set the client options to skip ACL and update CTIME in the dsm.opt file of all backup nodes.
- Allow ACL changes.
- From now on run mmbackup with the option --backup-migrated. This assures that for files where ACL have changed the CTIME update is sent to the IBM Spectrum Protect server.
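The switch between the two mmbackup options in the steps above can be made explicit in whatever script drives the backup. The Python sketch below only assembles the command line and does not execute it. The device path /gpfs/fs1 is a hypothetical example, and any further mmbackup arguments your environment needs are omitted.

```python
from typing import List

# The two options discussed in this article, keyed by hypothetical mode names.
_MIGRATED_FLAGS = {
    "skip": "--skip-migrated",
    "backup": "--backup-migrated",
}


def mmbackup_command(device: str, migrated_mode: str) -> List[str]:
    """Build an mmbackup argument list for the given handling of migrated files."""
    if migrated_mode not in _MIGRATED_FLAGS:
        raise ValueError(f"unknown migrated_mode: {migrated_mode!r}")
    return ["mmbackup", device, _MIGRATED_FLAGS[migrated_mode]]


if __name__ == "__main__":
    # Before the client options are set: avoid recall storms.
    print(" ".join(mmbackup_command("/gpfs/fs1", "skip")))
    # After SKIPACL or SKIPACLUPDATECHECK plus UPDATECTIME are set.
    print(" ".join(mmbackup_command("/gpfs/fs1", "backup")))
```

Returning a list rather than a single string makes it straightforward to pass the result to subprocess.run without shell quoting issues.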
Ensure that you always run backup before migration and use the options that migrate files only if they have a current backup. Otherwise, files that do not yet have a current backup may be migrated, and the next backup operation will recall these files.