In this blog article I share some guidance for integrating IBM Spectrum Protect for Data Retention (SP-DR) with storage pools provided by NetApp SnapLock® compatible file systems. Even though SnapLock compatible file systems are typically provided by NetApp filers, there are implementations from other vendors that adhere to the SnapLock semantics and can be used with SP-DR.
The drawing below shows the architecture of SP-DR integrating with SnapLock compatible storage. The SP-DR software runs on a Unix or Windows server. The SP-DR server is connected via 10 Gbit Ethernet to the SnapLock compatible storage system. The SnapLock storage system provides a SnapLock compatible file system (via NFS or POSIX) in which files can be managed in a write-once-read-many (WORM) fashion. The Spectrum Protect storage pools reside in the SnapLock compatible file system.
The archive application stores and retrieves archive objects using the TSM API. Each archive object is bound to a management class and archive copy group that defines the retention policy and periods and the destination storage pool, where the archive object is stored.
Using SnapLock compatible file systems with SP-DR provides an additional layer of WORM protection, because with the SnapLock semantics it is not possible to delete SP-DR storage pool volumes stored in SnapLock pools. However, there are some aspects to consider.
How SP-DR works with SnapLock volumes
SP-DR sets the retention on storage pool volumes (container files) in a SnapLock compatible file system based on the RETMIN and RETVER values configured in the archive copy group of the management class. SP-DR uses the append-only capability of SnapLock, which allows data to be appended to a file that is stored and protected in a SnapLock compatible file system. An append-only file is created by unsetting and then setting the write permission on an empty file. A file in append-only mode allows data to be appended, but existing data cannot be overwritten. Retention times can also be set for append-only files. A file in append-only mode can be made WORM protected (immutable) by unsetting the write permission once again. For more information about SnapLock append mode, see: https://library.netapp.com/ecm/ecm_download_file/ECMP1196889
When SP-DR writes an archive object to a storage pool volume provided in a SnapLock compatible file system, the following steps are performed:
- create an empty file
- make the file read-only
- make the file read-write → this enables the append mode on the file
- append the data → only append is allowed, no overwrite possible
- determine the expiration date based on the RETMIN and RETVER values of the objects in the volume (the greater value is chosen)
- set SnapLock retention for the storage pool volume by setting the last access time to the expiration date
- when the volume is full, set the file read-only and set the final expiration date → this makes the file WORM protected
This requires the SnapLock compatible file system to be compatible with the NetApp SnapLock semantics.
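The sequence above can be sketched with ordinary POSIX calls. The following Python is a minimal illustration and not SP-DR's actual implementation: the function name `write_worm_volume`, the permission bits and the use of `os.utime` are assumptions chosen for the example. On a regular file system the calls run but give no WORM protection; only a SnapLock compatible file system interprets the permission toggles and the last access time as described.

```python
import os
import time

SECONDS_PER_DAY = 86400


def write_worm_volume(path, chunks, retention_days, now=None):
    """Illustrative walk through the steps listed above (hypothetical helper).

    Returns the expiration time (seconds since the epoch) that was stored
    as the file's last access time.
    """
    now = time.time() if now is None else now

    # 1. Create an empty file (fails if the file already exists).
    with open(path, "xb"):
        pass

    # 2. Make the file read-only.
    os.chmod(path, 0o444)

    # 3. Make it read-write again. On a SnapLock compatible file system
    #    this toggle on an empty file is what enables append mode.
    os.chmod(path, 0o644)

    # 4. Append the data.
    with open(path, "ab") as volume:
        for chunk in chunks:
            volume.write(chunk)

    # 5./6. Derive the expiration date and store it as the last access
    #       time; the modification time is left unchanged.
    expiration = now + retention_days * SECONDS_PER_DAY
    mtime = os.stat(path).st_mtime
    os.utime(path, (expiration, mtime))

    # 7. Volume is full: remove write permission to commit it to WORM.
    os.chmod(path, 0o444)
    return expiration
```

In this sketch the retention is passed in as a number of days; in SP-DR it is derived from the RETMIN and RETVER values of the objects in the volume.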
Event-based copy groups
A special challenge arises when the archive copy group is configured for event-based retention (RETINIT=event). When SP-DR sets the retention on the storage pool volume it cannot account for the event-based retention period, because the retention period of the file is indefinite as long as the event has not been activated via the TSM API. It may take years before an event is activated, because in SP-DR the event typically triggers the expiration of the file. Instead, SP-DR sets the storage pool volume retention to the maximum of RETMIN and RETVER.
If RETMIN and RETVER are both set to 0 in the archive copy group, then the retention period of the storage pool volume is set to 0 days. In this case the minimum retention of the SnapLock compatible file system kicks in. The recommendation is to set the minimum retention to 30 days. This means that the storage pool volume is protected for 30 days, which by the way is the reclamation period in SP-DR. This assumes that all files in a storage pool volume have been archived at the same time and in the same archive copy group.
If the archive copy group is configured for chronological retention (RETINIT=chrono) then this challenge does not exist, because only the RETVER parameter dictates the retention period. The retention period configured in the RETVER parameter can be set directly for the storage pool volume at any point in time.
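The difference between the two retention modes can be summarized in a few lines of Python. This is only a sketch of the rule as described in this section; the function name and its parameters are hypothetical and do not correspond to any SP-DR interface.

```python
def requested_volume_retention_days(retinit, retver, retmin=0):
    """Retention in days that is requested for a storage pool volume.

    Hypothetical helper that mirrors the rule described in the text:
    - chronological retention: only RETVER applies
    - event-based retention: the greater of RETMIN and RETVER applies,
      because the event-driven part is unknown when the volume is written
    """
    if retinit == "chrono":
        return retver
    if retinit == "event":
        return max(retmin, retver)
    raise ValueError(f"unsupported RETINIT value: {retinit!r}")


# Event-based copy group with RETMIN=RETVER=0: the request is 0 days, so
# the minimum retention of the file system decides how long the volume is
# protected.
print(requested_volume_retention_days("event", retver=0, retmin=0))
```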
How reclamation works
SP-DR determines reclamation candidates based on the storage pool volume retention period and NOT based on the actual object retention period inherited from the archive copy group configuration. In the scenario where RETMIN and RETVER are 0, the storage pool volume (once full) is a candidate for immediate reclamation. The reclamation takes place within a reclamation period of 30 days. When reclamation starts, the server determines the reclaimable space based on the object retention. Objects whose retention period has expired become reclamation candidates. The retention period of the objects stored in the storage pool volume may still be indefinite with event-based retention. Those objects do not become reclamation candidates.
If the server does not find sufficient candidates in accordance with the reclamation threshold, then the storage pool volume is not reclaimed. The server instead extends the retention period of the volume by the value configured in the RETENTIONEXTENSION option. The new retention period is calculated by adding the number of days specified with RETENTIONEXTENSION to the current date.
If the server identifies sufficient reclamation candidates in accordance with the reclamation threshold, the subject storage pool volume is reclaimed. The server sets the retention period of the target volume to the greater of these values:
- The remaining retention time based on RETMIN and RETVER of the data (this might still be zero), plus 30 days for the reclamation period.
- The RETENTIONEXTENSION value plus 30 days for the reclamation period.
If RETENTIONEXTENSION is left at its default (365 days) then the storage pool volume retention period is extended every year for another year, as long as no data has expired. If data expires in the volume (e.g., the event has been activated) and the reclaimable space is greater than the reclamation threshold, the volume is reclaimed and the target volume gets a retention of 365 days, and so on.
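The decision described in this section can be written down as a small Python function. This is a sketch of the rules as stated above, not server code: the names `decide_reclamation` and `ReclaimDecision` are hypothetical, and treating "sufficient candidates" as "reclaimable percentage at or above the threshold" is an assumption.

```python
from dataclasses import dataclass
from datetime import date, timedelta

RECLAMATION_PERIOD_DAYS = 30  # reclamation period named in the text


@dataclass
class ReclaimDecision:
    reclaim: bool              # True if the volume is reclaimed
    new_retention_date: date   # retention of the target (or extended) volume


def decide_reclamation(today, reclaimable_pct, threshold_pct,
                       remaining_retention_days,
                       retention_extension_days=365):
    """Model of the reclamation outcome for one storage pool volume."""
    # Assumption: reaching the threshold counts as sufficient.
    if reclaimable_pct >= threshold_pct:
        # Target volume: the greater of the remaining data retention and
        # RETENTIONEXTENSION, each plus the reclamation period.
        days = max(remaining_retention_days,
                   retention_extension_days) + RECLAMATION_PERIOD_DAYS
        return ReclaimDecision(True, today + timedelta(days=days))
    # Not enough reclaimable space: keep the volume and extend its
    # retention to the current date plus RETENTIONEXTENSION.
    return ReclaimDecision(False,
                           today + timedelta(days=retention_extension_days))
```

For example, a volume that holds only event-based objects whose events have not been activated has no reclaimable space, so each reclamation attempt pushes its retention out by another RETENTIONEXTENSION days.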
General configuration recommendations
Find below some SP-DR server parameters that should be considered and adjusted:
NOMIGRECL should be disabled (not set)
The Reclamation Threshold should be set to a reasonable value in the storage pool
RETENTIONEXTENSION should be set to the default (365 days)
RETMIN and RETVER should reflect the objects' actual retention period
- the storage pool volume retention is set to max(RETMIN,RETVER) relative to the archive date
- if RETMIN=RETVER=0 then the storage pool volume retention is set to the value configured in the RETENTIONEXTENSION option during reclamation
Archive copy groups that use the same destination storage pool provided by a SnapLock compatible file system should be configured with identical retention settings (RETINIT, RETMIN, RETVER)
For storage pool volumes provisioned by NetApp SnapLock, set the following values in the NetApp SnapLock configuration:
- Minimum retention: 30 days (enforced by SnapLock if last access date < 30 days from now)
- Default retention: 30 days (enforced by SnapLock if no last access date is set)
- Maximum retention: 30 years (enforced by SnapLock if last access date > 30 years from now)
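How these three values interact with the last access date set by SP-DR can be modeled as follows. This is a simplified Python sketch of the behavior as listed above, not a NetApp interface: the function name is hypothetical and 30 years is approximated as 30 × 365 days.

```python
from datetime import datetime, timedelta


def effective_snaplock_retention(now, requested_atime=None,
                                 minimum=timedelta(days=30),
                                 default=timedelta(days=30),
                                 maximum=timedelta(days=30 * 365)):
    """Retention date enforced for a file, given the configured limits."""
    # No last access date was set: the default retention applies.
    if requested_atime is None:
        return now + default
    # Otherwise the requested date is raised to the minimum and capped at
    # the maximum retention.
    return min(max(requested_atime, now + minimum), now + maximum)


now = datetime(2024, 1, 1)
# A volume with RETMIN=RETVER=0 requests "now"; the 30-day minimum applies.
print(effective_snaplock_retention(now, requested_atime=now))
```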
When exporting to an SP-DR server with storage pools configured on SnapLock compatible file systems, the data cannot be imported directly into the target SnapLock storage pool, because the import process does not set the SnapLock retention accordingly (only RETENTIONEXTENSION). The import must be directed into an ordinary FILE or DISK pool. The SnapLock pool must be defined as NEXTPOOL so that imported data can be migrated to the SnapLock pool. The following limitations apply:
- the ordinary FILE or DISK pool MUST have volumes defined
- the ordinary FILE or DISK pool must have the same name as on the source
- SnapLock pools as NEXTPOOL of the disk pool must have different names
- at the end of the migration the disk pool can be migrated completely and then deleted, the SnapLock pool can be renamed, and the copy group destination can be set to the SnapLock pool.
Find more information about SnapLock with SP-DR in the Spectrum Protect Knowledge Center:
The integration and configuration of IBM Spectrum Scale immutable fileset with IBM Spectrum Protect for Data Retention is further described in this publication:
SnapLock® is a registered trademark of NetApp Inc.
Microsoft, Windows, Windows NT and the Windows logo are registered trademarks of Microsoft Corporation in the United States and/or other countries.
UNIX is a registered trademark in the United States and other countries licensed exclusively through The Open Group.
The registered trademark Linux® is used pursuant to a sublicense from the Linux Foundation, the exclusive licensee of Linus Torvalds, owner of the mark on a worldwide basis.