LAN-free and Server-free backup to disk with IBM Spectrum Protect™ and IBM Spectrum Scale™

By Nils Haustein posted Sat February 29, 2020 11:05 AM

  

By Nils Haustein, Joerg Walter and Andre Gaschler

Introduction

LAN-free backup can accelerate backup and recovery by using dedicated storage networks such as Storage Area Networks (SAN). In particular, the backup performance of large files, such as database and media files, can be improved compared to classical Local Area Network (LAN)-based backup.

 What is LAN-based, LAN-free and Server-free backup?

In a typical backup solution with IBM Spectrum Protect the backup client performs LAN-based backups to the backup server (Figure 1a). This means that the backup client sends data to the backup server over the LAN and the backup server stores the data on a storage device, which is typically disk or tape.

Figure 1a-1c: LAN-based, LAN-free and Server-free backup architectures


With LAN-free backup the backup client sends the data directly to the storage device of the backup server without sending it over the LAN (Figure 1b). The LAN connection to the server is only required to request access to the storage pool and to send the backup metadata that is managed by the backup server. LAN-free backup is typically used with tape as storage device characterized by high sequential throughput. The performance can be scaled linearly by adding tape drives and making them accessible to the backup client via the appropriate Storage Area Networks (SAN). This assumes that the backup client can initiate enough sessions to keep all tape drives running.

A special form of LAN-free backup is Server-free backup. With Server-free backup the client has two (logical) LAN connections: one to the backup server and one to the LAN-based storage device (Figure 1c). The connection to the server is used to request access to the storage pool and to send backup metadata that is managed by the server. The LAN connection to the storage device is used to store the backup data directly. This method is called Server-free because the backup data does not pass through the backup server, even though the storage device is accessed over the LAN.

Besides tape drives, IBM Spectrum Protect supports LAN-free backup to disk (and Server-free backup respectively) with IBM Spectrum Scale as destination. Spectrum Scale is a scalable and parallel file system providing comprehensive storage services [1]. Using Spectrum Scale as storage device for Spectrum Protect not only facilitates LAN-free backup to disk, but also provides better storage utilization and transparent scalability, both resulting in lower cost [2].

In this article we explain the architecture of a LAN-free and Server-free backup solution to disk for IBM Spectrum Protect with IBM Spectrum Scale and demonstrate the configuration of a LAN-free backup solution.

Architecture

In this section we explain the basic architectures for LAN-free and Server-free backup of IBM Spectrum Protect with IBM Spectrum Scale. First some background regarding the Spectrum Scale architecture. As shown in figure 2, there are different types of nodes within a Spectrum Scale cluster:

  • Nodes with direct access to storage LUN over a storage network
  • Nodes with no direct access to storage LUN

Figure 2: IBM Spectrum Scale architecture

In Spectrum Scale a storage LUN is called Network Shared Disk (NSD). An NSD is the Spectrum Scale representation of a storage LUN that is used to store file system data. Nodes with direct access to a storage LUN can be designated NSD servers. The designation of a node as NSD server for a particular NSD (or LUN) is configured in the NSD properties when the NSD is created. A designated NSD server can provide access to the storage LUN for nodes with no direct access to the storage LUN. These nodes are called NSD clients. They access the storage via designated NSD servers over Ethernet or Infiniband networks.

In a shared storage architecture, where all Spectrum Scale nodes have access to all storage LUNs via the storage network, a node with direct access to a storage LUN does not have to be a designated NSD server (Figure 2, node to the right). Such a node can access the storage LUN over the storage network but does not provide access for NSD clients. The advantage of designated NSD servers in a shared storage architecture (Figure 2, node to the left) is that they provide a redundant I/O path: if a node with direct access to the storage LUN cannot access the storage LUN via the storage network, it can access the NSD via the designated NSD server over the cluster network.

 

LAN-free backup

One prerequisite for LAN-free backup with Spectrum Scale is that the Spectrum Protect backup server and backup client are part of a Spectrum Scale cluster and have direct access to the storage devices (LUN) over a Storage Area Network (SAN). This requires the Spectrum Scale cluster to be configured in the shared storage topology where all cluster nodes have access to all storage LUN. Figure 3 shows a typical setup where the Spectrum Protect server and client are cluster nodes and have shared access to the Spectrum Scale storage:
Figure 3: LAN-free backup architecture of Spectrum Protect with Spectrum Scale


As shown in figure 3 the Spectrum Protect server runs on a Spectrum Scale node with access to the storage LUN presented by the storage system. The Spectrum Protect client also runs on a Spectrum Scale node and has direct access to all storage LUN configured for the Spectrum Scale file system. Consequently the Spectrum Protect client can be configured for LAN-free backup. Whether these nodes are designated NSD servers is an architectural decision and is not discussed herein.

 

Server-free backup

The Server-free backup is based on the Spectrum Scale client-server topology where the Spectrum Protect server and client are configured on Spectrum Scale client nodes. As shown in figure 2, a Spectrum Scale client node has no direct access to the storage device. It accesses a Spectrum Scale NSD server for storing data in the file system. Figure 4 shows a typical setup where the Spectrum Protect server and client are Spectrum Scale client nodes that are connected to the Spectrum Scale NSD server:
Figure 4: Server-free backup architecture of Spectrum Protect with Spectrum Scale


As shown in figure 4 the Spectrum Protect server and client run on Spectrum Scale NSD client nodes. These client nodes are connected to the Spectrum Scale NSD servers via the cluster network, which is based on Ethernet or Infiniband. The NSD clients mount all file systems where the backup data is stored. When an NSD client stores data, it accesses the NSD provided by the NSD server through a block-based protocol (the so-called NSD protocol), and the data is stored on the underlying storage LUN. Because the Spectrum Protect client can reach the NSD servers directly over this network, it stores the backup data through the NSD server on the storage device. This method is called Server-free because storing the backup data does not involve the Spectrum Protect server; only the backup metadata is transferred to the Spectrum Protect server.

How LAN-free and Server-free works with Spectrum Scale

As shown above, the prerequisite for LAN-free or Server-free backup to disk is that the Spectrum Protect server and client are members of the same Spectrum Scale cluster and have access to the same Network Shared Disks (NSD) of the file system where the backup data is stored (storage pool file system). The Spectrum Protect storage pool is configured in the Spectrum Scale file system for shared access (see section Configuration), allowing the Spectrum Protect client to back up data directly to the storage device (LAN-free) or NSD server (Server-free).

The Spectrum Protect client consists of the Backup/Archive (B/A) client and the Storage Agent (Spectrum Protect for Storage Area Networks). Figure 5 shows the general process for LAN-free backup:
Figure 5: LAN-free backup process


When the B/A client needs to back up data, it invokes the storage agent, which requests a storage pool volume from the Spectrum Protect server via the LAN (step 1). The Spectrum Protect server allocates and mounts a storage pool volume in shared mode (step 2) and returns the path to the storage pool volume to the storage agent (step 3). The B/A client now sends the backup data directly to the storage pool volume over the SAN (step 4). Hence the backup data is not sent through the Spectrum Protect server via the LAN.

The Spectrum Protect storage pool volume is a file in the Spectrum Scale file system. Because the Spectrum Protect client runs on a Spectrum Scale cluster node with direct access to the storage LUN, the Spectrum Protect client writes to all storage LUNs configured for this file system in parallel. It can leverage the available SAN bandwidth and performance of the configured storage LUN.

The Server-free backup works in a similar way. The difference is that the Spectrum Protect client cannot access the storage LUN via the storage network directly (see figure 4). It sends the backup data to the NSD servers via the cluster network. To accelerate the performance the Spectrum Scale cluster node running the Spectrum Protect client can be equipped with an additional high performance network connection to the NSD servers. This extra network can be based on Infiniband allowing the Spectrum Protect client high I/O performance with the NSD servers.

In both cases, the Spectrum Protect server stores the backup metadata for the backup operation obtained from the Spectrum Protect client via the LAN. 

Configuration

In this section we provide an example for configuring LAN-free backup of an existing Spectrum Protect and Spectrum Scale installation. Figure 6 shows the schematic setup being used in this example:
Figure 6: Example setup for LAN-free backup to disk with Spectrum Scale


The Spectrum Scale cluster comprises two nodes: one node running the Spectrum Protect server and one node running the Spectrum Protect client including the Storage Agent (STA) and the B/A client. The Spectrum Protect server name is TSM1, the Spectrum Protect Storage Agent name is STA1. The Spectrum Protect server and client are already configured.

The Spectrum Scale cluster is configured in a shared storage architecture where all nodes have access to all storage LUNs and file systems. Four file systems exist, whereby the file system /tsm/stg is used for shared access and LAN-free backup.

File system name    File system content
/tsm/db             Spectrum Protect DB
/tsm/log            Spectrum Protect Active Logs
/tsm/instance       Spectrum Protect instance
/tsm/stg            Spectrum Protect storage pools and archive log

 

The Spectrum Scale cluster is started and active on all nodes and all file systems are mounted accordingly. In addition the Spectrum Protect server is up and running. The STA is not started yet, as it needs to be registered to the Spectrum Protect server first.
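
The mount prerequisite can be checked on each node before going further. The following is a minimal sketch, not part of the original procedure: it assumes a Linux node where Spectrum Scale file systems appear in /proc/mounts with the file system type "gpfs", and the function name check_gpfs_mounts is hypothetical.

```shell
#!/bin/sh
# Sketch: verify that the given mount points are mounted as GPFS (Spectrum Scale).
# Assumption: mounted Spectrum Scale file systems show the type "gpfs" in /proc/mounts.

# check_gpfs_mounts MOUNTS_FILE MOUNTPOINT...
# Prints every mount point that is missing; exit status is non-zero if any is missing.
check_gpfs_mounts() (
  mounts_file=$1
  shift
  missing=0
  for mp in "$@"; do
    # /proc/mounts columns: device, mount point, file system type, options, ...
    if ! awk -v mp="$mp" '$2 == mp && $3 == "gpfs" { found = 1 } END { exit !found }' "$mounts_file"; then
      echo "not mounted as gpfs: $mp"
      missing=1
    fi
  done
  exit "$missing"
)

# Example call for the four file systems used in this article:
# check_gpfs_mounts /proc/mounts /tsm/db /tsm/log /tsm/instance /tsm/stg
```

Running the check on both the server node and the client node confirms that /tsm/stg is visible on both sides, which is what the shared device class relies on.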

To configure LAN-free backup for the Spectrum Protect client with the STA named STA1 and the Spectrum Protect server named TSM1 the following steps are required:

If not already done, set a global server password on Spectrum Protect server (TSM1):

SET SERVERPA <TSMSRV-password>


Register the STA in the Spectrum Protect server:

DEFINE SERVER STA1 hla=sta1.acme.com lla=1500 serverpa=<STA-password>


Define a shared device class using the storage pool file system. To accommodate more storage pools within the /tsm/stg file system, the shared device class points to directory /tsm/stg/shared:

DEFINE DEVCL DC_LANFREE_STG DEVT=FILE DIR=/tsm/stg/shared MAXCAP=128G MOUNTL=3 shared=yes

When a shared FILE device class is defined, the Spectrum Protect server automatically creates a "FILE library" with the same name as the device class (DC_LANFREE_STG). In addition, it creates a number of "FILE drives"; the number of drives corresponds to the MOUNTLimit parameter of the device class, which is set to 3 in the example above. The drive names are derived from the name of the device class with a number appended. In this example the names of the "FILE drives" created by the Spectrum Protect server are DC_LANFREE_STG1, DC_LANFREE_STG2 and DC_LANFREE_STG3.

Now a path to each drive needs to be defined for the Storage Agent STA1. In this example the MOUNTLimit is set to 3, so we have to define paths for 3 drives. Each path points to our shared storage pool directory:

DEFINE PATH STA1 DC_LANFREE_STG1 library=DC_LANFREE_STG srct=server destt=drive device=file directory="/tsm/stg/shared"
DEFINE PATH STA1 DC_LANFREE_STG2 library=DC_LANFREE_STG srct=server destt=drive device=file directory="/tsm/stg/shared"
DEFINE PATH STA1 DC_LANFREE_STG3 library=DC_LANFREE_STG srct=server destt=drive device=file directory="/tsm/stg/shared"
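
With a larger MOUNTLimit, typing one DEFINE PATH command per drive becomes tedious. The following is a minimal sketch that generates the commands from the naming rule described above (device class name plus a running number). The function name gen_file_drive_paths is hypothetical; the generated text has the same form as the three commands above.

```shell
#!/bin/sh
# Sketch: print one DEFINE PATH command per FILE drive of a shared device class.
# Drive names follow the rule from the text: <device class name><number>.

# gen_file_drive_paths STA DEVCLASS MOUNTLIMIT DIRECTORY
gen_file_drive_paths() (
  sta=$1
  devclass=$2
  mountlimit=$3
  dir=$4
  i=1
  while [ "$i" -le "$mountlimit" ]; do
    printf 'DEFINE PATH %s %s%d library=%s srct=server destt=drive device=file directory="%s"\n' \
      "$sta" "$devclass" "$i" "$devclass" "$dir"
    i=$((i + 1))
  done
)

# Values from the example in this article
gen_file_drive_paths STA1 DC_LANFREE_STG 3 /tsm/stg/shared
```

The output can be pasted into an administrative session or saved to a file and reviewed before it is executed.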


To perform backup operations, a storage pool that uses the shared device class needs to be defined. In this example the storage pool has the name "LANFREE":

DEFINE STG LANFREE DC_LANFREE_STG pooltype=primary maxscr=20


In the client option file of the Spectrum Protect client (dsm.sys) the following option needs to be set in order to enable LAN-free:

ENABLELANFREE YES
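
A quick way to confirm that the option is active is to search the option file for an uncommented ENABLELANFREE line. This is a minimal sketch with a hypothetical function name (lanfree_enabled). It assumes that comment lines in dsm.sys start with an asterisk, and it does not distinguish between server stanzas if the file contains more than one.

```shell
#!/bin/sh
# Sketch: check whether a client option file enables LAN-free data movement.
# Assumption: comment lines start with "*", so they do not match the pattern below.

# lanfree_enabled DSM_SYS_FILE
# Exit status 0 if an uncommented "ENABLELANFREE YES" line exists (case-insensitive).
lanfree_enabled() {
  grep -Eiq '^[[:space:]]*enablelanfree[[:space:]]+yes[[:space:]]*$' "$1"
}

# Example call; the path is an assumption about the default client location on Linux:
# lanfree_enabled /opt/tivoli/tsm/client/ba/bin/dsm.sys && echo "LAN-free enabled"
```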

 

When the STA is registered to the Spectrum Protect server, it can be configured and started on the backup client node:

Register the device configuration file in the STA option file:

vi /opt/tivoli/tsm/StorageAgent/bin/dsmsta.opt
DEVCONFIG devconfig.out


Configure the storage agent:

./dsmsta setstorageserver myname=STA1 mypassword=<STA-password> myhladdress=sta1.acme.com servername=TSM1 serverpassword=<TSMSRV-password> hladdress=tsm1.acme.com lladdress=1500


Configure the automatic start of the storage agent and start it (this example is for RHEL 6):

cp /opt/tivoli/tsm/StorageAgent/bin/dsmsta.rc /etc/init.d
chkconfig dsmsta.rc on
service dsmsta.rc start

 

Once the STA is started you can use the VALIDATE LANFREE command to check if everything is configured properly and which management classes can be used for LAN-free data transfer.
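
VALIDATE LANFREE takes the client node name and the storage agent name as arguments. The following is a minimal sketch of how the check could be run from the administrative command line. The node name NODE1 and the environment variables TSM_ADMIN and TSM_ADMIN_PW are assumptions for illustration; they do not appear in the example setup above.

```shell
#!/bin/sh
# Sketch: build and (optionally) run the VALIDATE LANFREE check.
# NODE1, TSM_ADMIN and TSM_ADMIN_PW are hypothetical names for this illustration.

# validate_lanfree_cmd NODE STA
# Prints the administrative command for the given node and storage agent.
validate_lanfree_cmd() {
  printf 'VALIDATE LANFREE %s %s\n' "$1" "$2"
}

# Run the command only if the administrative client and credentials are available;
# otherwise just print it so it can be entered manually.
if command -v dsmadmc >/dev/null 2>&1 && [ -n "${TSM_ADMIN:-}" ] && [ -n "${TSM_ADMIN_PW:-}" ]; then
  dsmadmc -id="$TSM_ADMIN" -password="$TSM_ADMIN_PW" "$(validate_lanfree_cmd NODE1 STA1)"
else
  validate_lanfree_cmd NODE1 STA1
fi
```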

Server-free backup is set up in the same way. The prerequisites are that the Spectrum Protect server and client are part of the same cluster and have the shared storage pool file system mounted.

Spectrum Scale considerations

To benefit from the performance advantages of LAN-free and Server-free backup, the network between the Spectrum Protect client and the storage (or the NSD servers, respectively) must provide the required bandwidth. In a LAN-free environment this is easy to implement by dedicating SAN resources for backup. In a Server-free environment the network between the Spectrum Protect client and NSD server is based on Ethernet or Infiniband and may become the bottleneck. Consider attaching an additional network between the Spectrum Protect client and the Spectrum Scale NSD servers, in addition to the cluster network (figure 4). If the additional network is a high-speed network, it facilitates high I/O performance.

Also the storage system must be designed to provide the required bandwidth. Consider other I/O activities induced by the Spectrum Protect DB and housekeeping operations. Using Flash for LAN-free backup might be suitable to provide a high and scalable bandwidth.

Certain Spectrum Scale configuration parameters on the Spectrum Scale node running the Spectrum Protect client can be used to tune the LAN-free backup performance. Find below a short explanation of some parameters in the context of LAN-free backup. For more details about tuning parameters see [3].

Spectrum Scale parameter   Note
File system block size     Larger block sizes for the storage pool file system are typically better for sequential I/O. The file system block size should be aligned to the stripe size of the underlying RAID arrays. With ESS the block size can be between 2 and 8 MB.
pagepool                   Internal memory used for read and write caching. More cache memory is generally better, but it should not exceed 50% of the system memory.
maxFilesToCache            Maximum number of files that can be cached at a time. Depends on the number of files backed up in one backup session and the number of parallel backup sessions.
maxMBpS                    Guides the number of caching threads being scheduled. May be set to two times the throughput of the storage. Note: if a storage LUN is represented by a large number of disks (e.g. with Spectrum Scale RAID), set ignorePrefetchLUNCount=yes.
prefetchThreads            Maximum number of caching threads responsible for reads. Depends on the number of parallel restore sessions.
worker1Threads             Maximum number of caching threads responsible for writes. Depends on the number of parallel backup sessions.
prefetchPct                Guides the usage of memory (pagepool) for caching. Higher values are typically beneficial.
nsdSmallThreadRatio        Ratio for small and large I/O queues. Should be set to 1.
nsdThreadsPerQueue         Number of threads per queue.
nsdMaxWorkerThreads        Number of NSD worker threads. Should be set according to the number of NSDs available to the storage pool file system and the IOPS possible for each NSD.
subnets                    Allows the definition of additional subnets between Spectrum Scale NSD clients and servers.

 

Note, the parameters above should be tuned with care and only by Spectrum Scale experts.
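
Two of the rules of thumb in the table are simple arithmetic: pagepool should stay at or below 50% of system memory, and maxMBpS may be set to twice the storage throughput. The following is a minimal sketch of that arithmetic with hypothetical function names. It computes upper bounds only, not recommended values, and it assumes a Linux node where /proc/meminfo reports MemTotal in kB.

```shell
#!/bin/sh
# Sketch: derive upper bounds for pagepool and maxMBpS from the rules in the table.
# These are bounds for orientation, not tuning recommendations.

# max_pagepool_mib MEMINFO_FILE
# Prints 50% of MemTotal in MiB (MemTotal is reported in kB).
max_pagepool_mib() {
  awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 2 }' "$1"
}

# suggested_maxmbps STORAGE_THROUGHPUT_MBPS
# Prints two times the given storage throughput (in MB/s).
suggested_maxmbps() {
  echo $(( $1 * 2 ))
}

# Example calls; the throughput of 2000 MB/s is a made-up value:
# max_pagepool_mib /proc/meminfo
# suggested_maxmbps 2000
#
# The resulting values could then be applied with mmchconfig on the client node,
# for example (node name and values are placeholders):
# mmchconfig pagepool=32768M,maxMBpS=4000 -N <client-node>
```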

Limitations

The following limitations apply:

  • LAN-free backup to disk is only supported with Spectrum Scale.
  • We have tested LAN-free backup only with Spectrum Protect server and client being part of the same cluster. We have not tested cross-cluster mounts. This may not be supported today.
  • LAN-free backup is not possible with container-based storage pools (inline deduplication).
  • Deduplication is only possible with legacy deduplication on FILE volumes that can be set up for shared access via the device class definition (server-side deduplication only).

 

References:

[1] Redbook with a comprehensive overview to IBM Spectrum Scale
http://www.redbooks.ibm.com/abstracts/sg248254.html?Open  

[2] Spectrum Protect with Spectrum Scale Introduction
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/PRS5334

[3] Spectrum Scale tuning parameters
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20%28GPFS%29/page/Tuning%20Parameters
