Expanding an IBM Smart Analytics System database and redistributing data

By Rahul Kumar posted Fri December 02, 2022 08:32 AM

  

Executive Summary

Data warehouse environments continue to experience an explosion in data growth. As a result, you might need additional storage capacity to cope with increased enterprise demands. To help you meet these demands, the IBM® Smart Analytics System is a flexible data warehousing solution that supports a building-block approach to expansion. An important follow-up activity to that expansion is ensuring that you redistribute the data across all database partitions.

To add storage capacity to an IBM Smart Analytics System, you scale out the data warehouse by adding a new data module, across which you expand the existing database. Naturally, you then must redistribute the existing data across the database partitions. This activity reduces the amount of storage that is used on each data module and allows for the continued growth of the database across the entire data warehouse.

The objective of adding storage capacity is to accommodate additional data in the database. You might initially experience an increase in performance because each database partition has proportionally less data to process than before the expansion. However, as you continue to add data, performance levels will revert to what they were prior to the expansion. Adding storage capacity does not, in itself, deliver a lasting performance improvement.

This paper recommends best practices for the process of expanding the database and redistributing data following the hardware build, installation, and cluster expansion (1). This paper is perhaps more prescriptive than other best practice papers in that it is also an end-to-end guide to the entire redistribution process. Redistributing data by using the approach recommended in this paper entails a database outage, so it is essential to complete the operation as efficiently as possible and to avoid problems along the way. This paper does not cover manually repartitioning data online in a database.

The key to ensuring a successful database expansion and data redistribution is good planning. This paper covers the prerequisite steps for a successful, timely redistribution operation, such as preparing your database and gathering information. The paper is accompanied by a set of scripts that gather much of the information that you need, and it provides instructions on how to interpret that data, for example, to estimate how long redistribution will take. Another important factor is tuning the performance of the redistribution operation. By making temporary configuration changes, such as increasing your sort heap and utility heap sizes, and by using the recommended command syntax, you can improve the speed of the operation.
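As a rough illustration of the temporary configuration changes mentioned above, the following shell sketch prints the `db2 update db cfg` commands that would raise the SORTHEAP and UTIL_HEAP_SZ database configuration parameters before a redistribution and restore them afterwards. The database name MYDW, all heap values, and the helper names (run, raise_heaps, restore_heaps) are hypothetical placeholders chosen for this example, not settings recommended by the paper. The script runs in dry-run mode by default, so it only prints the commands.

```shell
#!/bin/sh
# Sketch only. DB_NAME and all heap values are hypothetical placeholders,
# not settings recommended by the paper. Both parameters are in 4 KB pages.
DB_NAME="MYDW"
TEMP_SORTHEAP=20000
TEMP_UTIL_HEAP_SZ=131072
ORIG_SORTHEAP=10000       # record the real values first: db2 get db cfg for MYDW
ORIG_UTIL_HEAP_SZ=65536
DRY_RUN=1                 # 1 = only print commands; 0 = execute them

# Print the command in dry-run mode; otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Temporarily raise the sort heap and utility heap before redistribution.
raise_heaps() {
    run db2 update db cfg for "$DB_NAME" using \
        SORTHEAP "$TEMP_SORTHEAP" UTIL_HEAP_SZ "$TEMP_UTIL_HEAP_SZ"
}

# Put the original values back once redistribution has finished.
restore_heaps() {
    run db2 update db cfg for "$DB_NAME" using \
        SORTHEAP "$ORIG_SORTHEAP" UTIL_HEAP_SZ "$ORIG_UTIL_HEAP_SZ"
}

raise_heaps
# ... run the redistribution operation here ...
restore_heaps
```

Keeping the original values in variables makes the change easy to reverse, which matters because the paper describes these increases as temporary.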

(1) For a detailed discussion on identifying, planning, and preparing for an expansion of your IBM Smart Analytics System, see the Expanding an IBM Smart Analytics System paper (http://www.ibm.com/developerworks/data/bestpractices/expandingsmartanalytics/index.html).

Introduction

This paper deals with two distinct processes: expanding the DB2 data warehouse database across a new data module and redistributing data across the expanded database.

The first section of the paper introduces the concepts of database expansion and data redistribution and describes what you must consider when planning a database expansion. The second section describes how you can use the REDISTRIBUTE command with parameters that help maximize performance, and it provides an overview of the accompanying scripts. The third section is a detailed guide to database expansion and data redistribution. The appendixes contain sample output and further information to help ensure a successful process.

Two scripts are available with this paper for download from the developerWorks site. These scripts help in the pre-analysis and post-analysis of the database, and you should use them to help prepare and plan for the redistribution process. The scripts and the output from these scripts are referenced in this paper.

Use the command examples in this paper as a guideline. To help avoid problems, always create scripts to execute the commands and test all scripts that you use. Also, use utilities such as nohup or screen so that a connection failure does not cause a command or script to fail.
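The advice about nohup can be sketched as a small wrapper that starts a long-running command detached from the terminal session and captures its output in a log file. The helper name run_detached, the DETACHED_PID variable, and the script and log file names in the usage comment are hypothetical and are used only for illustration.

```shell
#!/bin/sh
# Sketch only: run_detached is a hypothetical helper, not part of the
# scripts that accompany the paper.
#
# Usage: run_detached LOG_FILE COMMAND [ARGS...]
# Starts COMMAND under nohup in the background so that it keeps running
# if the terminal connection drops. Output goes to LOG_FILE, and the
# process ID is stored in DETACHED_PID.
run_detached() {
    log_file="$1"
    shift
    nohup "$@" > "$log_file" 2>&1 < /dev/null &
    DETACHED_PID=$!
}

# Example (redistribute.sh is a hypothetical script that you have tested):
# run_detached redistribute.log sh ./redistribute.sh
# echo "Started with PID $DETACHED_PID; follow progress with: tail -f redistribute.log"
```

Redirecting both standard output and standard error to a log file gives you a record of the run to review if the session is lost.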

Download the full report for more on Expanding an IBM Smart Analytics System database and redistributing data.
