Packing Too Much in One Suitcase?
Co-authors: Keigo Matsubara, Storage Technical Specialist; David Bohm, IBM Storage Protect Development
Background
Are you tired of running into storage capacity issues with your IBM Storage Protect servers? You're not alone! Many users have expressed the need for a way to seamlessly rebalance nodes between multiple instances to avoid these headaches. Currently, IBM Storage Protect lacks virtual clustering, forcing a Backup/Archive (B/A) client to be tightly coupled with a specific server instance. But don't worry, we've got a solution to help you mitigate this situation and keep your storage running smoothly.
Rebalancing Nodes on IBM Storage Protect Servers
Considerations
Assumptions and Prerequisites
- Identifying Nodes for Migration: Use the admin command "QUERY OCCUPANCY" to pinpoint nodes whose migration would reduce the storage space occupied on ServerA.
  Also consider the deduplication percentage of each node. A high deduplication percentage can indicate little value in moving a node, because the chunks it shares with backup objects from other nodes stay behind on the source Storage Protect server even after the move. For a node in container storage pools, the GENERATE DEDUPSTATS and QUERY DEDUPSTATS admin commands can help with this determination. Note: These are only guidelines to support the decision; the selection of nodes for migration is up to the user.
- Backup Downtime: Users must agree to a backup downtime window during node migration. The smaller the delta between the two servers in the node's data and metadata, the shorter the downtime.
- Replication Setup: Ensure replication between ServerA and ServerB is already established and runs regularly.
- Free Space on ServerB: ServerB must have sufficient free space to host the node's data going forward.
- BA Client Connection: Nodes must connect to the Storage Protect server using the B/A client. Nodes using other Storage Protect clients, such as Data Protection for Microsoft SQL Server or Data Protection for VMware, are not in scope.
- Replication to ServerC: Optionally, set up replication from ServerB to ServerC for data protection and redundancy.
- Authentication: LDAP/AD authentication is not used for node/Storage Protect client authentication.
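The node-selection guidance above can be turned into a rough ranking. The following is a minimal sketch, not an IBM-provided method: it assumes you have already collected each node's occupied space (for example from QUERY OCCUPANCY) and its deduplication percentage (for example from QUERY DEDUPSTATS) by hand. The names NodeStats, estimated_gain_gb, and rank_candidates, the weighting formula, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    name: str
    occupied_gb: float  # space the node occupies on ServerA (entered by hand)
    dedup_pct: float    # deduplication percentage for the node, 0-100


def estimated_gain_gb(node: NodeStats) -> float:
    """Hypothetical heuristic: discount occupancy by the dedup percentage,
    because chunks shared with other nodes stay on the source server."""
    return node.occupied_gb * (100.0 - node.dedup_pct) / 100.0


def rank_candidates(nodes):
    """Return the nodes ordered by estimated gain, best candidate first."""
    return sorted(nodes, key=estimated_gain_gb, reverse=True)


if __name__ == "__main__":
    # Illustrative values only, not measurements from a real server.
    sample = [
        NodeStats("NODE_A", occupied_gb=500, dedup_pct=90),
        NodeStats("NODE_B", occupied_gb=300, dedup_pct=10),
    ]
    for node in rank_candidates(sample):
        print(node.name, estimated_gain_gb(node))
```

The ranking is only a starting point for discussion; as noted above, the final choice of nodes remains with the user.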
Workflow
Preparation
- Lock the Node: Stop backups and restores for the node by locking it on ServerA.
- SSL Certificate Check: Ensure the node can connect to ServerB without SSL certificate errors.
- Password Access: Verify that the node can connect with the passwordaccess generate setting in the dsm.opt/dsm.sys file.
- Sync Servers: Ensure ServerA and ServerB are completely synchronized.
- Deactivate Original STGRULE: Deactivate the original storage rule so that data is no longer replicated from ServerA to ServerB for all nodes.
- Define New Storage Rule: Create a new storage rule to replicate data for the specific node from ServerA to ServerB.
- Start Replication Rule: Start the new replication rule with 'forcereconcile=yes'.
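The command-driven preparation steps above can be sketched as an ordered list of admin commands to run on ServerA. This is only an illustration under stated assumptions: the command spellings (in particular the ACTIVE and ACTIONTYPE parameters, and the use of a non-replicating rule with a subrule to replicate a single node) are recalled from the administrative command reference and must be verified against the documentation for your server level. The function name and all rule, node, and server names are hypothetical. The SSL, password, and synchronization checks are manual and are not covered.

```python
def preparation_commands(node, original_rule, new_rule, subrule, target_server):
    """Return the ServerA admin commands for the preparation steps, in order.

    All command syntax below is an assumption to be checked against the
    command reference before use.
    """
    return [
        # Stop backups and restores for the node.
        f"LOCK NODE {node}",
        # Pause the rule that replicates all nodes to ServerB.
        f"UPDATE STGRULE {original_rule} ACTIVE=NO",
        # New rule that replicates nothing by default ...
        f"DEFINE STGRULE {new_rule} {target_server} ACTIONTYPE=NOREPLICATING",
        # ... with an exception that replicates only the node being moved.
        f"DEFINE SUBRULE {new_rule} {subrule} {node} ACTIONTYPE=REPLICATE",
        # Run the new rule with a forced reconcile.
        f"START STGRULE {new_rule} FORCERECONCILE=YES",
    ]


if __name__ == "__main__":
    for command in preparation_commands(
        "NODE1", "REPL_ALL", "REPL_NODE1", "ONLY_NODE1", "SERVERB"
    ):
        print(command)
```

Printing the commands rather than executing them keeps the sketch safe to run; review each line before pasting it into an administrative session.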
Data Integrity and Inventory
- Inventory Expiration: Perform inventory expiration on both servers for the specific node.
- Object Count: Check object counts on both servers for the specific node.
  Run a select against the backup_objects, archive_objects, and spaceman_objects tables, joined with the replicated_objects table, for objects from the given NODEID to see whether any objects are still missing replication.
  After the inventory expiration process is complete on both servers, compare the object counts on the source and target replication servers in the backup_objects, archive_objects, and spaceman_objects tables for the specific node.
- Unresolved Chunks: Ensure there are no unresolved chunks on ServerB. Use the admin command "SHOW UNRESOLVEDCHUNKS" to check for this.
- Replication Groups: Check the status of in-flight replication groups. The admin command "SHOW REPLGROUP" provides this information.
- Retention Sets: Note that retention sets are not replicated.
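The object-count check above can be sketched as follows. This is a minimal illustration, not the authors' query: it builds one count statement per table named in the text and compares counts that you have collected from each server. The column name nodeid is an assumption taken from the NODEID mentioned above, the join with replicated_objects is left out because its join columns are not given here, and the function names and sample numbers are hypothetical.

```python
TABLES = ("backup_objects", "archive_objects", "spaceman_objects")


def count_queries(node_id: int) -> dict:
    """Build one SELECT COUNT(*) statement per table for the given node.

    Assumes each table exposes a 'nodeid' column; verify on your server.
    """
    return {
        table: f"SELECT COUNT(*) FROM {table} WHERE nodeid={node_id}"
        for table in TABLES
    }


def count_mismatches(source_counts: dict, target_counts: dict) -> dict:
    """Return {table: (source, target)} for every table whose counts differ."""
    mismatches = {}
    for table in TABLES:
        src = source_counts.get(table, 0)
        tgt = target_counts.get(table, 0)
        if src != tgt:
            mismatches[table] = (src, tgt)
    return mismatches


if __name__ == "__main__":
    for statement in count_queries(42).values():
        print(statement)
    # Illustrative numbers only: counts gathered by hand from each server.
    server_a = {"backup_objects": 1200, "archive_objects": 30, "spaceman_objects": 0}
    server_b = {"backup_objects": 1198, "archive_objects": 30, "spaceman_objects": 0}
    print(count_mismatches(server_a, server_b))
```

An empty result from count_mismatches suggests the node's inventory matches on both servers; any entry is a reason to rerun replication before switching clients over.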
Switchover Clients
- Update Configuration: Change the dsm.opt/dsm.sys file to point to ServerB as the primary server.
- Define Schedule Associations: Set up schedule associations on ServerB that match those on ServerA.
- Restart Services: Restart all client services and daemons.
- Conduct Tests: Ensure the client operates correctly and backup/restore functions work as expected.
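The configuration update above usually comes down to changing the server address and port in the client options file. The following is a minimal sketch that rewrites the TCPSERVERADDRESS and TCPPORT options in the text of a dsm.opt or dsm.sys file. It assumes the options are written with their full names (abbreviated forms are not handled), and the function name and host names are hypothetical.

```python
import re


def point_to_server(options_text: str, address: str, port: int) -> str:
    """Return the options text with the server address and port replaced.

    Only lines whose first word is TCPSERVERADDRESS or TCPPORT (any case)
    are changed; every other line, including comments, is kept as-is.
    """
    updated = []
    for line in options_text.splitlines():
        match = re.match(r"^(\s*)(\S+)", line)
        key = match.group(2).lower() if match else ""
        if key == "tcpserveraddress":
            updated.append(f"{match.group(1)}TCPSERVERADDRESS {address}")
        elif key == "tcpport":
            updated.append(f"{match.group(1)}TCPPORT {port}")
        else:
            updated.append(line)
    return "\n".join(updated)


if __name__ == "__main__":
    # Hypothetical stanza pointing at ServerA.
    current = (
        "SERVERNAME sp\n"
        "  TCPServeraddress servera.example.com\n"
        "  TCPPort 1500\n"
        "  PASSWORDACCESS GENERATE"
    )
    print(point_to_server(current, "serverb.example.com", 1500))
```

Working on the file text instead of editing the file in place makes it easy to review the result and keep a backup of the original options file.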
Cleanup Instructions
On Source ServerA
- Delete STGRULE: Remove the replication rule for the node from ServerA to ServerB.
- Update Node: Set "REPLState=disabled" for the migrated node.
- Remove Replnode Definition: Remove the replication node definition for the given node.
- Decommission Node: Decommission the node on ServerA.
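The source-side cleanup above can be sketched as an ordered command list. This is an illustration under stated assumptions: the command spellings are recalled from the administrative command reference and should be verified for your server level, and the function name, the verified flag, and the node and rule names are hypothetical. Because these steps are destructive, the sketch refuses to produce commands until the caller confirms that the integrity checks and client tests have passed.

```python
def source_cleanup_commands(node: str, node_rule: str, verified: bool) -> list:
    """Return the ServerA cleanup commands in the order listed above.

    Raises ValueError unless the caller confirms that the data-integrity
    checks and the client switchover tests have already succeeded.
    """
    if not verified:
        raise ValueError("Run the integrity checks and client tests first")
    return [
        # Remove the node-specific replication rule.
        f"DELETE STGRULE {node_rule}",
        # Disable replication state for the migrated node.
        f"UPDATE NODE {node} REPLSTATE=DISABLED",
        # Remove the node's replication definition.
        f"REMOVE REPLNODE {node}",
        # Retire the node on the source server.
        f"DECOMMISSION NODE {node}",
    ]


if __name__ == "__main__":
    for command in source_cleanup_commands("NODE1", "REPL_NODE1", verified=True):
        print(command)
```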
On Target ServerB
- Remove Replication Relationship: Remove the replication relationship for the node associated with ServerA.
- Resume Original STGRULE: Activate the original STGRULE on ServerA.
Final Thoughts
By following these steps, you can rebalance nodes between IBM Storage Protect servers and avoid storage capacity issues on an overloaded instance. The process helps you manage storage more efficiently and, with the optional replication from ServerB to ServerC, keeps the migrated nodes protected by a redundant copy.