Comparing duration of optimized and normal recalls with IBM Spectrum Protect for Space Management

By Nils Haustein posted 8 days ago

  

This article presents test results comparing the duration of tape optimized recalls and normal recalls for files of different sizes with IBM Spectrum Protect for Space Management integrated with IBM Spectrum Scale. Within each test, all files had the same size.

Here is a summary of the test results:


These test results clearly indicate that optimized recalls have a significantly shorter duration, especially for large numbers of smaller files. For example, recalling 1000 x 10 MB files took 150 seconds with an optimized recall and 8893 seconds with normal recalls. Normal recalls are challenging in a tiered storage file system, as discussed in the blog article “Best practices for managing files within tiered storage file systems with tape” (https://community.ibm.com/community/user/storage/blogs/nils-haustein1/2020/01/14/managing-files-in-tiered-storage). The test results presented here underpin the need to prevent normal recalls and use tape optimized recalls instead.

Before the test setup and results are presented in detail, the next section gives a short introduction to the solution architecture that integrates IBM Spectrum Scale with IBM Spectrum Protect for Space Management.

 

Introduction to IBM Spectrum Scale integrated with IBM Spectrum Protect for Space Management

IBM Spectrum Scale is a clustered parallel file system that can be used to store files on the most appropriate storage tier based on Information Lifecycle Management (ILM) policies. As shown in the picture below, the IBM Spectrum Scale file system consists of file system pools, and each pool represents a storage tier. There are internal pools and external pools. An internal pool uses block storage devices such as disk, solid state disk and flash that are attached to the IBM Spectrum Scale cluster nodes. There can be multiple internal pools, each using different storage media. An external pool is represented by an interface program that migrates files to and recalls files from external storage.

The IBM Spectrum Protect for Space Management client provides an interface program and can be configured as an external pool in IBM Spectrum Scale file systems. With this configuration, IBM Spectrum Protect for Space Management can migrate and recall files from internal pools of the IBM Spectrum Scale file system to one or more IBM Spectrum Protect servers. In many cases the migrated files are stored on tapes that are attached to the IBM Spectrum Protect server.

After a file is migrated by the IBM Spectrum Protect for Space Management client, the file is still visible in the file system, but the file content resides in storage managed by the IBM Spectrum Protect server. When a user of the file system accesses the migrated file, the IBM Spectrum Protect for Space Management client intercepts this request, recalls the file from the IBM Spectrum Protect server and places it in the internal pool.

There are two types of recalls with IBM Spectrum Protect for Space Management: normal recall and tape optimized recall. A normal recall is triggered when a migrated file is accessed in the file system. If the file is stored on a tape in the IBM Spectrum Protect server, then the server loads and positions the tape and sends the file data to the space management client. The space management client writes the data back to the internal file system pool. When multiple migrated files are accessed at the same time, multiple independent normal recalls are triggered, each requiring a tape to be loaded and positioned and a file to be read. If multiple files are on the same tape, these files are not recalled in the order in which they are stored on tape. This causes tape start-stop operations, which are usually time consuming. In addition, normal recalls can cause chaotic tape load operations because the requests are not sorted by tape ID, so the same tape may be loaded more than once. This is even more time consuming.

Tape optimized recalls are much faster and gentler on tape resources. The tape optimized recall is triggered by an administrative command that includes a list of file names to be recalled. This command is passed to the space management client, which sorts these file names by tape ID and position on tape. Afterwards it mounts the required tapes and copies the files back to disk in parallel. Hence, all files that are on one tape are copied back in the order in which they are stored on tape. In addition, this prevents chaotic tape loads because each required tape is loaded and processed only once.
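The ordering idea behind the tape optimized recall can be illustrated with a small shell sketch. It assumes a hypothetical whitespace-separated inventory with tape ID, position on tape and file name on each line. This format, the sample entries and the function names are made up for illustration and do not reflect how the space management client works internally.

```shell
#!/bin/bash
# Illustration only: order recall candidates by tape ID, then by position on tape.
# Input format (hypothetical): <tape-id> <position> <file-name>, one entry per line.
order_for_recall() {
  # -k1,1  : group entries by tape ID so that each tape is handled once
  # -k2,2n : within a tape, list files in ascending (numeric) position
  LC_ALL=C sort -k1,1 -k2,2n
}

# Made-up inventory, listed in the order in which the files were requested
sample_inventory() {
  printf '%s\n' \
    'TAPE02 40 /fs/dir/file_d' \
    'TAPE01 300 /fs/dir/file_a' \
    'TAPE02 7 /fs/dir/file_b' \
    'TAPE01 12 /fs/dir/file_c'
}

sample_inventory | order_for_recall
```

In the sorted output both TAPE01 entries come first in ascending position, followed by both TAPE02 entries. Processing the requests in their original order would instead alternate between the two tapes.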

The test results presented below confirm that optimized recalls are faster and gentler on tape resources.

Test setup

The test environment included a 4-node IBM Spectrum Scale cluster with one node dedicated to the IBM Spectrum Protect for Space Management operations. The table below shows the key properties of the IBM Spectrum Scale cluster hosting the IBM Spectrum Protect for Space Management node:

| Property | Value |
|---|---|
| Operating system version | RHEL 8.4 |
| Spectrum Scale version | 5.1.1.2 |
| Spectrum Protect for Space Management client version | 8.1.12.1 |
| Network connection client - server | 10 GbE |
| File system block size | 256 KB |
| Baseline file system performance | ~2 GB/sec |

 

The IBM Spectrum Protect for Space Management node was connected to the IBM Spectrum Protect server via a 10 GbE network. The space management pool was configured as a tape pool, with no intermediate disk pool. The tape pool contained one LTO-8 tape. The table below shows the key properties of the IBM Spectrum Protect server environment:

| Property | Value |
|---|---|
| Operating system version | AIX |
| Spectrum Protect server version | 8.1.13.000 |
| Tape technology | 1 x LTO-8 |
| Tape library | TS4500 |

Test methodology

Several files with the same size and random content were stored in one directory of the space managed IBM Spectrum Scale file system. There were multiple directories, each containing several files with the same size.

All files subject to recall were migrated to a single tape in sequence using the following IBM Spectrum Protect for Space Management client command:

# dsmmigrate -d /path-and-filename/

 

The recalls were executed using the IBM Spectrum Protect for Space Management client dsmrecall command. Duration of the recall commands was measured with the time command, for example:

# time dsmrecall -d -filelist=filelist /hana

The real time value was used to determine the duration of the command.

Each test was executed multiple times and the mean was determined.
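The measurement and averaging step could be scripted along the following lines. This is a minimal sketch, not the tooling used for the tests: the function names `timed_run` and `mean_of` are made up, and `date +%s.%N` assumes GNU date as found on RHEL.

```shell
#!/bin/bash
# Print the wall-clock duration of one command in seconds (3 decimals).
# Usage: timed_run <command> [args...]
timed_run() {
  local start end
  start=$(date +%s.%N)            # GNU date: seconds and nanoseconds
  "$@" > /dev/null 2>&1           # output of the command is discarded
  end=$(date +%s.%N)
  awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f\n", e - s }'
}

# Print the mean of the numbers read from stdin, one number per line.
mean_of() {
  awk '{ sum += $1; n++ } END { if (n > 0) printf "%.1f\n", sum / n }'
}
```

A call such as `timed_run dsmrecall -d -filelist=filelist /hana` would yield one duration per test run, and the collected durations can be piped into `mean_of`. The sketch does not cover migrating the files again or dismounting the tape between runs.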

After each test was completed, the tape was dismounted from the tape drive. Each test therefore included the time for mounting and positioning the tape.

Optimized recall processing

Several files with the same size (in one directory) were recalled with a single tape optimized recall using the following IBM Spectrum Protect for Space Management client command:

# time dsmrecall -d -filelist=filelist /hana

The file list contained the fully qualified path and file names of all files in this directory.
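A file list of this kind could be generated as sketched below. The function name `build_filelist` is hypothetical, and the sketch assumes a list with one fully qualified file name per line. Whether names containing blanks need additional quoting in the list is not covered here and should be checked against the dsmrecall documentation.

```shell
#!/bin/bash
# Write the fully qualified names of all regular files in one directory
# to a list file, one name per line. Subdirectories are not descended into.
# Usage: build_filelist <directory> <list-file>
# Note: keep the list file outside <directory>, otherwise it lists itself.
build_filelist() {
  local dir list
  dir=$(cd "$1" && pwd) || return 1   # resolve to an absolute path
  list=$2
  find "$dir" -maxdepth 1 -type f | sort > "$list"
}
```

For example, `build_filelist /hana/testdir /tmp/filelist` would create a list for a directory named `/hana/testdir`; both path names are placeholders.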

Normal recall processing

Normal recall of single files was executed using the following command:

# time dsmrecall -d /path-and-filename

 

Normal recalls of files with the same size in the same directory were executed in parallel. For each file, one dsmrecall process was started in the background, and completion was monitored in a loop. For this purpose, a script was created that executes the following workflow:

#!/bin/bash
# $flist is the file list provided to the script as its first argument
flist="$1"
while IFS= read -r line;
do
  # for each file name in the list start a recall in background;
  # stdout and stderr are both discarded
  dsmrecall -d "$line" > /dev/null 2>&1 &
done < "$flist"
# monitor the completion of all recall processes
echo "Info: waiting for recalls to complete"
exists=$(ps -ef | grep "dsmrecall -d" | grep -v "grep")
while [[ -n "$exists" ]];
do
  sleep 1
  exists=$(ps -ef | grep "dsmrecall -d" | grep -v "grep")
done
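As an alternative to polling the process table, the same workflow can be sketched with the `wait` builtin of the shell. This is a variant, not the script used for the tests. The `RECALL_CMD` variable is a hypothetical override that allows the sketch to be tried without a tape environment.

```shell
#!/bin/bash
# Variant of the parallel recall workflow that uses the "wait" builtin.
# RECALL_CMD is a hypothetical override; it defaults to the command from the article.
RECALL_CMD=${RECALL_CMD:-dsmrecall -d}

# Usage: recall_parallel <file-list>
recall_parallel() {
  local flist=$1 line
  while IFS= read -r line; do
    [[ -z "$line" ]] && continue      # skip empty lines
    # start one recall per file in the background (unquoted on purpose:
    # RECALL_CMD holds the command name and its option)
    $RECALL_CMD "$line" > /dev/null 2>&1 &
  done < "$flist"
  # block until every background job started by this shell has finished
  wait
}
```

The difference to the polling loop is the scope: `wait` only covers the recalls started by this shell, whereas the `ps` based loop also waits for any other `dsmrecall -d` process running on the node.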

The script was invoked with the following command:

# time ./trecall.sh filelist

The file list contained the names of several files with the same size located in the same directory.

For test cases with more than 100 files (e.g., 1000 x 10 MB files), the file list was broken down into sub-lists, each containing 100 files. The recall workflow recalled 100 files at a time, and as soon as one sub-list was finished, the next sub-list of 100 files was processed. The duration for processing all sub-lists was measured with the time command.
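The breakdown into sub-lists can be sketched with the standard `split` utility. The function name `recall_in_batches` and the temporary sub-list names are assumptions made for this illustration. The command that processes one sub-list is passed in as an argument.

```shell
#!/bin/bash
# Process a long file list in sub-lists of a fixed size, one sub-list at a time.
# Usage: recall_in_batches <file-list> <batch-size> <command> [args...]
# The command is called once per sub-list with the sub-list name appended.
recall_in_batches() {
  local flist=$1 size=$2
  shift 2
  local workdir sub
  workdir=$(mktemp -d) || return 1
  # split writes sub-lists sub.aa, sub.ab, ... with at most $size lines each
  split -l "$size" "$flist" "$workdir/sub."
  for sub in "$workdir"/sub.*; do
    [ -e "$sub" ] || continue   # nothing to do for an empty file list
    "$@" "$sub"                 # the next sub-list starts after this call returns
  done
  rm -rf "$workdir"
}
```

With the script shown above, the overall duration could then be measured with `time recall_in_batches filelist 100 ./trecall.sh`.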

Test results

The table below summarizes the duration and throughput of the recall tests, comparing optimized recalls with normal recalls. These metrics include the time required for mounting and positioning the tape.

| Test scenario | Optimized recall: duration in sec | Optimized recall: throughput in MB/sec | Normal recall: duration in sec | Normal recall: throughput in MB/sec |
|---|---|---|---|---|
| Recall of 100 x 4 KB files | 101 | 0.004 | 142 | 0.003 |
| Recall of 1024 x 4 KB files | 118 | 0.03 | 769 | 0.01 |
| Recall of 1000 x 10 MB files | 150 | 66.56 | 8893 | 1.12 |
| Recall of 100 x 1 GB files | 343 | 298.54 | 3400 | 30.12 |
| Recall of 20 x 10 GB files | 691 | 296.38 | 1630 | 125.64 |
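The throughput values follow from the total data volume divided by the duration, and the benefit of the optimized recall can be expressed as the ratio of the two durations. The sketch below recomputes some of the table values. The function names are made up, and counting 1 GB as 1024 MB is an inference: it is the convention under which the published GB-sized values are reproduced.

```shell
#!/bin/bash
# Throughput in MB/sec for <count> files of <size_mb> MB recalled in <seconds>.
throughput() {
  awk -v n="$1" -v mb="$2" -v s="$3" 'BEGIN { printf "%.2f\n", n * mb / s }'
}

# Factor by which the optimized recall is faster than the normal recall.
speedup() {
  awk -v normal="$1" -v optimized="$2" 'BEGIN { printf "%.1f\n", normal / optimized }'
}

throughput 100 1024 343     # 100 x 1 GB, optimized recall -> 298.54
throughput 20 10240 1630    # 20 x 10 GB, normal recall    -> 125.64
speedup 8893 150            # 1000 x 10 MB files           -> 59.3
speedup 142 101             # 100 x 4 KB files             -> 1.4
```

The ratios show where the gap is widest: about a factor of 59 for 1000 x 10 MB files, compared to a factor of about 1.4 for 100 x 4 KB files, where the tape mount and positioning time dominates both recall types.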

 

The following sections compare the duration of optimized and normal recalls for each test scenario.

Recall 100 x 4 KB files

Here are the results for recalling 100 x 4 KB files:

Recall 1024 x 4 KB files

Here are the results for recalling 1024 x 4 KB files:

Recall 1000 x 10 MB files

Here are the results for recalling 1000 x 10 MB files:

Recall 100 x 1 GB files

Here are the results for recalling 100 x 1 GB files:

 

Recall 20 x 10 GB files

Here are the results for recalling 20 x 10 GB files:


 

Disclaimer

The information contained in this documentation is provided for informational purposes only. While efforts were made to verify the completeness and accuracy of the information provided, it is provided “as is” without warranty of any kind, express or implied. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, this documentation or any other documentation. Nothing contained in this documentation is intended to, nor shall have the effect of, creating any warranties or representations from IBM (or its suppliers or licensors), or altering the terms and conditions of the applicable license agreement governing the use of IBM software.

The performance data contained herein was obtained in a controlled, isolated environment. Actual results that may be obtained in other operating environments may vary significantly. While IBM has reviewed each item for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere.

 
