Originally posted by: JOWALTER
by Jörg Walter, Andre Gaschler and Erik Franz
Preface
IBM Spectrum Protect (formerly known as Tivoli Storage Manager / TSM) is designed to protect business-critical data and applications that require continuous availability and disaster protection. Highly available Spectrum Protect infrastructures, in conjunction with ProtecTIER’s data deduplication and replication features, lead to minimal RTO/RPO times combined with maximum space efficiency.
IBM’s ProtecTIER solution offers data deduplication and replication features that allow backup data to be replicated efficiently to offsite locations, without the need to move physical tapes. In addition, ProtecTIER provides the following benefits:
- Deduplication performance of up to 2,500 MB/s for backup, and even higher performance for restore operations
- Single system capacity up to 1 PB physical repository size
- LAN-free backup and restore capabilities
- Near sync replication for improved RPO
- Support for multi-site replication requirements
In Spectrum Protect and ProtecTIER backup environments, ProtecTIER deduplicates and replicates the backup data, while Spectrum Protect manages and replicates the backup catalog (metadata), which is stored in an integrated IBM DB2 database.
By combining the IP-based replication features of ProtecTIER and Spectrum Protect, it is possible to design a flexible data protection environment with multi-site redundancy.
This article describes the setup of a multi-site redundant backup environment using Spectrum Protect together with ProtecTIER. It is based on the experience we gained during a customer implementation and various tests in the ESCC Mainz Storage Systems Lab.
We will give you a short introduction to the DB2 HADR feature and to the ProtecTIER solution. After that, we’ll explain how a multi-site redundant backup environment based on Spectrum Protect together with ProtecTIER is designed.
What is DB2 HADR?
High Availability Disaster Recovery (HADR) is a data replication feature that provides a high availability solution for DB2 databases.
HADR protects against data loss by replicating changes from a source database (Primary) to a target database (Standby).
The following list describes why you should think about using DB2 HADR in your Spectrum Protect environment:
- HADR is a standard feature of DB2, which is included with TSM beginning with version 6.x, so it is ready to use.
- Using HADR only for DB2 bundled with TSM requires no additional licenses.
- HADR communication is managed by the database, using standard TCP/IP networks, so there are no special requirements regarding disk subsystems or other HW or SW.
- HADR is easy to set up and manage. Only a few commands are required to configure HADR on an existing TSM instance.
- HADR makes it possible to implement cluster features at the application layer, with no need for operating system cluster support.
- HADR supports both HA and DR scenarios:
  - HADR “sync” peers provide warm standby for HA
  - HADR “async” peers provide warm standby for DR
  - Both variants can be combined in an environment
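To give an idea of the "few commands" mentioned in the list above, here is a minimal Python sketch that assembles the DB2 command-line calls commonly used to configure and start an HADR pair. It only builds the command strings and does not execute anything. The database name TSMDB1, the host names, the port 60100 and the instance name tsminst1 are assumptions for illustration, and the sketch assumes the standby database has already been created from a restored backup of the primary.

```python
def hadr_config_command(db, local_host, remote_host, port, remote_inst,
                        syncmode="NEARSYNC"):
    """Build the 'db2 update db cfg' call that sets the HADR parameters on one peer."""
    params = [
        ("HADR_LOCAL_HOST", local_host),
        ("HADR_LOCAL_SVC", port),
        ("HADR_REMOTE_HOST", remote_host),
        ("HADR_REMOTE_SVC", port),
        ("HADR_REMOTE_INST", remote_inst),
        ("HADR_SYNCMODE", syncmode),
    ]
    settings = " ".join(f"{key} {value}" for key, value in params)
    return f"db2 update db cfg for {db} using {settings}"


def hadr_start_command(db, role):
    """Build the 'db2 start hadr' call for the given role."""
    if role not in ("primary", "standby"):
        raise ValueError("role must be 'primary' or 'standby'")
    return f"db2 start hadr on db {db} as {role}"


if __name__ == "__main__":
    # Hypothetical hosts: tsm-dc1 is the primary, tsm-dc2 the standby.
    print(hadr_config_command("TSMDB1", "tsm-dc1", "tsm-dc2", 60100, "tsminst1"))
    print(hadr_config_command("TSMDB1", "tsm-dc2", "tsm-dc1", 60100, "tsminst1"))
    # The standby is started first, then the primary.
    print(hadr_start_command("TSMDB1", "standby"))
    print(hadr_start_command("TSMDB1", "primary"))
```

Each peer gets the same parameters with local and remote host swapped, which is why a single helper is enough for both sides.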
Starting with DB2 v10.1, up to three HADR standby databases can be set up for a primary database. This feature is available with Spectrum Protect (TSM) v7.1, which contains DB2 v10.5.
One system needs to be designated as “Principal Standby”, while additional standby systems can be added as “Auxiliary Standby”.
All of the HADR sync modes are supported on the principal standby, but the auxiliary standbys always run in SUPERASYNC mode.
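The rule above can be sketched in a few lines of Python. The function takes an ordered list of standbys, where the first entry is treated as the principal standby, and returns a value in the format of the DB2 HADR_TARGET_LIST parameter (host:port entries separated by "|") together with the sync mode each standby would effectively use. Host names and the port are hypothetical, and the function name is made up for this illustration.

```python
SYNC_MODES = ("SYNC", "NEARSYNC", "ASYNC", "SUPERASYNC")


def plan_standbys(standbys, principal_syncmode):
    """Return (target_list, modes) for an ordered list of (host, port) standbys.

    The first entry is the principal standby and uses the configured sync mode;
    every further entry is an auxiliary standby and always uses SUPERASYNC.
    """
    if not 1 <= len(standbys) <= 3:
        raise ValueError("between one and three standby databases are supported")
    if principal_syncmode not in SYNC_MODES:
        raise ValueError(f"unknown sync mode: {principal_syncmode}")

    target_list = "|".join(f"{host}:{port}" for host, port in standbys)
    modes = {}
    for index, (host, _port) in enumerate(standbys):
        modes[host] = principal_syncmode if index == 0 else "SUPERASYNC"
    return target_list, modes


if __name__ == "__main__":
    # Hypothetical hosts: tsm-dc2 is the principal, tsm-dc3 the auxiliary standby.
    targets, modes = plan_standbys([("tsm-dc2", 60100), ("tsm-dc3", 60100)], "SYNC")
    print(targets)
    print(modes)
```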
IBM ProtecTIER at a glance
IBM ProtecTIER with HyperFactor is software running on Linux that provides inline data deduplication for backup data (e.g. from Spectrum Protect, NetBackup, etc.).
Various configurations are available, e.g.:
- Small TS7620 appliance or TS7650G gateway (single node or cluster)
- Fibre Channel-attached Virtual Tape Library (VTL) emulation
- Ethernet-attached CIFS, NFS or OST interfaces
IBM ProtecTIER with native IP Replication
IBM ProtecTIER native IP replication provides an option to replicate virtual cartridges (for VTLs) or files (for systems using the File System Interface / FSI) from one ProtecTIER system to another via standard TCP/IP networks. Thanks to a high degree of parallelism, this option can move large amounts of backup data to an offsite location over long distances. Only the deduplicated portion of the data is replicated. The following diagram gives an overview of an IP replication scenario:
Designing a multi-site redundant backup environment
The following diagram shows a three-site replicated backup environment:
- A Spectrum Protect server (HADR primary) has two HADR standby servers. The principal standby is in a second data center in the main location and acts as a failover system, e.g. for hardware maintenance. The auxiliary standby acts as a failover system for DR purposes.
- Each server has a ProtecTIER TS7650G (VTL) attached.
- Virtual cartridges are replicated from PT_A to PT_B and PT_C.
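The three-site layout described in the list above can also be written down as data, which makes it easy to see which sites can take over. The following Python sketch is only an illustration: the site name DC3 for the third location and the pairing of PT_A, PT_B and PT_C with the three sites are assumptions, and the helper function is hypothetical.

```python
# Assumed mapping of sites to HADR roles and ProtecTIER systems.
SITES = {
    "DC1": {"hadr_role": "primary", "protectier": "PT_A"},
    "DC2": {"hadr_role": "principal standby", "protectier": "PT_B"},
    "DC3": {"hadr_role": "auxiliary standby", "protectier": "PT_C"},
}

# Cartridge replication: source ProtecTIER -> list of target ProtecTIER systems.
REPLICATION = {"PT_A": ["PT_B", "PT_C"]}


def failover_candidates(sites, replication, failed_site):
    """List sites that hold both a standby database and replicated cartridges."""
    source_pt = sites[failed_site]["protectier"]
    replicas = set(replication.get(source_pt, []))
    return [
        name
        for name, site in sites.items()
        if name != failed_site
        and "standby" in site["hadr_role"]
        and site["protectier"] in replicas
    ]


if __name__ == "__main__":
    print(failover_candidates(SITES, REPLICATION, "DC1"))
```

A site only qualifies as a failover target if it has both halves of the backup environment: the replicated database (metadata) and the replicated cartridges (data).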
The following steps have to be performed to fail over TSM and ProtecTIER from DC1 to DC2:
- Fail over the Spectrum Protect (TSM) application:
  - Check out all libvolumes from the VTL library (remove=no, checklabel=no)
  - Halt TSM in DC1
  - Restart DB2 in DC1 in standby role
  - Perform the DB2 HADR takeover from DC1 to DC2, monitor peering and finally start TSM in DC2
- Fail over the ProtecTIER-based storage pool(s):
  - Update the VTL library definition to “serial=autodetect”
  - Delete all drives and all paths to the VTL library in TSM (e.g. by using “perform libaction”)
  - Re-define the library path and all drives using the proper device names on the failover host
  - Enable “DR mode” for the ProtecTIER in DC2 to stop incoming replication traffic from DC1
  - Use the PT GUI to move the replicated cartridges to the prepared VTL partition in DC2
  - Check in the libvolumes to the re-defined library in TSM (first check in scratch, then private)
- Optional: Prepare for continuing operations in DC2:
  - All replicated tape cartridges are read-only, which still allows data to be restored
  - In order to perform new backups on the failover site, create new virtual tape cartridges (read-write)
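Because the order of the steps above matters, it can help to keep them as an ordered runbook. The Python sketch below lists the steps together with the DB2 or TSM administrative command where one applies; steps performed in the ProtecTIER GUI or depending on local device names are kept as manual entries. The library name VTLLIB, the database name TSMDB1 and the volume names are assumptions, as are the check-in options search=yes and checklabel=barcode. The commands are illustrative and should be checked against the documentation of the installed server level.

```python
from typing import NamedTuple, Optional


class Step(NamedTuple):
    site: str
    action: str
    command: Optional[str]  # None marks a manual or site-specific step


def failover_steps(library, db, volumes):
    """Return the DC1 -> DC2 failover steps in the order they are performed."""
    steps = [
        Step("DC1", f"check out volume {vol}",
             f"checkout libvolume {library} {vol} remove=no checklabel=no")
        for vol in volumes
    ]
    steps += [
        Step("DC1", "halt the TSM server", "halt"),
        Step("DC1", "restart DB2 in standby role", None),
        Step("DC2", "take over the HADR primary role", f"db2 takeover hadr on db {db}"),
        Step("DC2", "monitor HADR peering", f"db2pd -db {db} -hadr"),
        Step("DC2", "start the TSM server", None),
        Step("DC2", "let TSM detect the library serial number",
             f"update library {library} serial=autodetect"),
        Step("DC2", "delete all drives and paths",
             f"perform libaction {library} action=delete"),
        Step("DC2", "re-define library path and drives with local device names", None),
        Step("DC2", "enable DR mode on the ProtecTIER", None),
        Step("DC2", "move replicated cartridges to the prepared VTL partition", None),
        Step("DC2", "check in scratch volumes",
             f"checkin libvolume {library} search=yes status=scratch checklabel=barcode"),
        Step("DC2", "check in private volumes",
             f"checkin libvolume {library} search=yes status=private checklabel=barcode"),
    ]
    return steps


if __name__ == "__main__":
    for number, step in enumerate(failover_steps("VTLLIB", "TSMDB1", ["A00001"]), 1):
        print(f"{number:2d}. [{step.site}] {step.action}: {step.command or '(manual)'}")
```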
The failback from DC2 to DC1 follows the same procedure in the opposite direction.
Summary
DB2 HADR offers a great approach to replicating a Spectrum Protect server database (the metadata) to one or more standby target sites. Combined with the native IP replication feature of the IBM ProtecTIER VTL system, it is possible to build easy-to-use, efficient, highly available, high-capacity and high-performance backup solutions that also provide superior disaster protection.