Db2

 View Only

Disk Management in Db2

By Justin Sobieski posted Wed April 24, 2019 05:24 PM

  

In this article, we will take a high level look at Db2’s implementation of disk management, the topics and terminology related to that subject, and some effective ways of helping manage it as a database administrator. This article will include explanations of varying depth on all of the following subjects in Db2:

  • Hierarchical data storage subdivisions
  • Instances
  • Tablespaces
  • Containers
  • Extents
  • Pages
  • Tables
  • Storage Groups
  • Automatic storage

 

A comprehensive understanding of all of the relevant mechanisms involved in Db2’s disk management is absolutely essential to being able to manage a database environment optimally. Even though much of the work is done automatically behind the scenes, and more and more so with each new version released, Db2 offers extensive tools available to database administrators to manually adjust every possible parameter as they see fit.

Db2: Disk Management Hierarchy

              Let us first discuss the general hierarchy of how db2 manages data from a conceptual standpoint.

Db2 organizes all of it’s databases and database management systems within an Instance. An Instance is basically just an environment set aside to process and work only with the databases within that instance. Making multiple instances on the same server is common, giving each server it’s own unique environment and instance username identifier.

 db2_work_2019_pic_3.png

Disk management is further subdivided at the tablespace level. A tablespace is just a particular amount of disk space, or a set of volumes on disk(s), whose space is set aside upfront for storing and managing the data of the tables that reside within that tablespace. Or in other terms: Tablespaces are physical locations on disk that store database objects and are considered a sequential set of pages.

Tablespaces themselves are further subdivided into CONTAINERS, for even more flexibility and control.

Finally within these containers are where the actual data inside tables is stored.
db2_work_2019_pic_1.png

Finally the actual data of a table is broken up into EXTENTS, each of which hold a set number of PAGES.

The sizes of all these parameters can be set manually, usually by updating Database Configuration environment variables. Setting the correct PAGESIZE and EXTENTSIZE can have drastic effects on performance.

              Now, let’s get into the general topic of tables, some of their important terminology, tablespaces, and the distinction between DMS and SMS.

Table Basics and Terminology

  • Tables are logical structures maintained by the database manager and are stored on a physical media such as disk
  • All tables are comprised of rows and columns
  • A column is a set of values with the same data type or definition.
  • At the intersection of a row and column being a value.
  • The order of rows in the table generally cannot be controlled and is determined by the database manager.
    • DB2 supports ordered tables where the DB2 attempts to maintain the rows in the table in ascending or descending order, but order is not guaranteed.
  • There is no “row number” associated with each row. A column in the table can contain a unique row number but DB2 does not maintain any row number for sequential rows in a table.
  • For ROW Organized tables:
    • Rows of data are stored in PAGES inside a TABLESPACE
    • Each row in a page is considered to be in a SLOT in the page (the first row is in slot 0, etc).
    • Each row inside a data page has a header set of bytes (8 bytes in length)
    • All Rows in tables are Variable Length and cannot span pages (10.5 allows some spanning)
    • Every column that is nullable contains an extra byte in the row structure in the data page indicating whether that column for that row is null
  • Every PAGE has a header of about 88 bytes
  • For COLUMN Organized tables
    • Organizes by COLUMNS instead of by rows, automatically generates indexes on each column
    • Designed to optimize analytics with db2 BLU acceleration
    • Values for each column in a table are stored in the same Extent (set of pages)

Tablespaces

As discussed previously, tablespaces are literally spaces on disk that have been set aside up front in which is managed and stored CONTAINERS, that contain extents, that contain pages, etc.

Db2 tablespaces have two distinctions, SMS (System Managed Storage), and DMS (Database Managed Storage). In the most simplistic terms, SMS basically lets the operating system have control over how large database objects are, while DMS lets the database itself, and the parameters a DBA sets on it, have control over managing database objects on file systems.

In more detailed terms,

For SMS:

  • In SMS tablespaces, the containers is a disk directory.
  • Each object in an SMS Tablespace is stored in one or more separate files under the disk directory.
    • DAT – data
    • INX – index
    • LRG – large data
  • SMS tablespaces are limited in size by
    • The size of the filesystem on which they reside
    • The upper limit of number of pages allowed in an SMS TS
  • Typically temporary tablespaces are SMS
  • SMS Performance is generally slower than DMS
  • As tables in an SMS tablespace are dropped, the files in which the table was stored are dropped and the space is available on the filesystem.

While for DMS:

  • In DMS Filesystem tablespaces, each container name is the name of the file which will hold the all objects created in the tablespace.
  • DB2 will allocate extents to each object (data, index or large) within each container in the tablespace.
  • DMS tablespaces can be expanded or reduced in size by the DBA.
  • Tables created in DMS tablespaces can separate Data, Index and Long into individual DMS tablespaces.
  • While each container file can be different sizes, the best practice is to keep each container the same size so the containers fill uniformly over time.
  • Pagesize and Extentsize are fixed and cannot be changed
  • Prefetchsize can be altered
  • Bufferpool association can be altered
  • NO FILE SYSTEM CACHING allows DB2 to tell the OS not to cache DB2 tablespace files in the OS filesystem cache. Can be turned on or off.
  • DROP TABLE RECOVERY writes required log records so that a dropped table can be recovered and to an export file with the ROLLFORWARD command.

The size limits for tables in DMS and non-temporary automatic are based on the page size and tablespace size (regular or large).

db2_work_2019_pic_2.png
Adjusting the page size and using appropriate tablespaces can drastically improve performance, and should be considered on a case-by-case basis.

Finally, we will discuss storage groups, what they are, and how they can be used effectively.

Storage Groups

Storage Groups are a mechanism for grouping different storage paths into logical groups. Storage Groups are often used for Fast, Medium and Slow speed storage devices. Storage Groups have a set of containers paths associated with that storage group.

  • When creating a tablespace, specify that the tablespace should be created in a storage group and do not specify containers. The containers for that tablespace are created on the storage paths in the storage group.
  • Tablespaces can be migrated between storage groups as needs dictate.

 

The following graphic illustrates the common practice of grouping data in storage groups based on how often the data from that group is needed. This can drastically optimize cost/performance metrics, especially when there are different types of disks in use with drastic speed differences (like those between SSD’s and traditional spinning hard drives).
db2_work_2019_pic_4.png

One final topic of importance related to storage groups, is the option of AUTOMATIC STORAGE. Storage groups can either be assigned AUTOMATIC STORAGE, or left database managed storage by default. Note, all db2 BLU (column-organized) tables, are managed in automatic storage tablespaces by default as well.

Regarding AUTOMATIC STORAGE:

Db2 has an autonomous storage management system called AUTOMATIC STORAGE, that can be set at either at the database level, or on a table by table level. Instead of manually having to create and set aside disk space and directories yourself, db2 will automatically handle the necessary disk management features when this feature is in use.

A set of Storage Paths are identified for use by Automatic Storage Tablespaces. All tablespaces then created with Automatic Storage share these Storage Paths, but have separate container files for each tablespace. Automatic Storage Paths are directories. (if that wasn’t clear already) The database manager manages the container and space allocation for each Automatic Storage Tablespace as they grow in use.

To set AUTOMATIC STORAGE at the database level either…

  • 1: When creating a database use the ON path clause to identify your Storage Paths or use ALTER DATABASE:
  • Issue an ALTER DATABASE command as such:
db2_work_2019_pic_5.png
  • In V10.1 and newer, all databases are created with automatic storage unless you specify the AUTOMATIC STORAGE NO clause

 

In conclusion, there is much more to Db2’s disk management than one may initially assume. All of the facets of the system exist for a good reason, and can be taken advantage of once one has an understanding of how and why they work. For further reading on all of this, of course check out the latest versions knowledge center: https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.1.0/com.ibm.db2.luw.welcome.doc/doc/welcome.html

Check back on this blog soon, as new articles will be published on other important topics in Db2 on a weekly basis!


#Db2
1 comment
28 views

Permalink

Comments

Thu April 25, 2019 04:46 AM

Very Good