
IBM Storage Ceph Object Storage Tiering Enhancements. Part One

By Daniel Alexander Parkes

  


Introduction

IBM Storage Ceph offers object storage tiering capabilities to optimize cost and performance by seamlessly moving data between storage classes. These tiers can be configured locally within an on-premises infrastructure or extended to include cloud-based storage classes, providing a flexible and scalable solution for diverse workloads. With policy-based automation, administrators can define lifecycle policies to migrate data between high-performance storage and cost-effective archival tiers, ensuring the right balance between speed, durability, and cost-efficiency.

Local storage classes in IBM Storage Ceph allow organizations to tier data between fast NVMe or SSD-based pools and more economical HDD-based pools within their on-premises IBM Storage Ceph Cluster. This is particularly beneficial for applications requiring varying performance levels or scenarios where data "ages out" of high-performance requirements and can be relegated to slower, more economical storage. 

In addition to local tiering, IBM Storage Ceph offers Policy-Based Data Archival and Retrieval capabilities to integrate with S3-compatible platforms for off-premises data management. Organizations can use this feature to archive data to cloud-based tiers such as IBM Cloud Object Storage, AWS S3, Azure Blob, or S3 tape endpoints for long-term retention, disaster recovery, or cost-optimized cold storage. By leveraging policy-based automation, Ceph ensures that data is moved to the cloud based on predefined lifecycle rules, enhancing its value in hybrid cloud strategies.

IBM Storage Ceph 8.0 New Feature: Policy-Based Data Retrieval

Initially, IBM Storage Ceph's Policy-Based Data Archival to S3-compatible platforms offered a uni-directional data flow, where data could only be archived from local storage pools to the designated cloud storage tier. While this allowed users to leverage cost-effective cloud platforms for cold storage or long-term data retention, the lack of retrieval capabilities limited the solution’s flexibility in data management. This meant that once data was archived to cloud storage, it could no longer be actively retrieved or re-integrated into local workflows directly through Ceph.

As of IBM Storage Ceph 8.0, we are introducing Policy-Based Data Retrieval, a significant evolution of the tiering capabilities that is now available as a Tech Preview. This enhancement enables users to retrieve objects that have been transitioned to S3 cloud or tape endpoints directly into their on-prem Ceph environment, eliminating the limitations of the previous uni-directional flow. Data can be restored as temporary or permanent objects.

  • Temporary restores: The restore bypasses lifecycle cloud-transition rules and is automatically deleted after the specified duration, reverting the object to its previous stub state. 
  • Permanent restores: These fully reintegrate objects into the Ceph cluster, where they are treated like regular objects and subjected to standard lifecycle policies and replication processes.

This retrieval of objects can be done in two different ways (see the sketch after this list):

  • S3 RestoreObject API: allows users to retrieve objects from the remote S3 endpoint with an explicit S3 RestoreObject API request.
  • Read-through Object Retrieval: enables standard S3 GET requests on transitioned objects to restore them to the Ceph cluster transparently.
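
As a rough sketch of these two paths (the full retrieval walkthrough is the subject of part two; the profile, bucket, and key names simply reuse the ones from the walkthrough below), a temporary restore via the S3 RestoreObject API would look like this, with the object reverting to its stub state after the requested number of days:

# aws --profile tiering --endpoint https://s3.cephlabs.com s3api restore-object --bucket databucket --key hosts --restore-request '{"Days": 3}'

A read-through retrieval is simply a standard GET (or copy) of a transitioned object, which triggers the restore transparently, provided read-through is enabled for the storage class:

# aws --profile tiering --endpoint https://s3.cephlabs.com s3 cp s3://databucket/hosts ./hosts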

 

In this release, we don't support object retrieval from S3 cloud/tape endpoints that use the Glacier API, such as IBM Deep Archive. This enhancement is targeted for the next release of IBM Storage Ceph.

Policy-Based Data Archival Step-by-Step Walkthrough

In this section, we will configure and set up the Policy-Based Data Archival feature of IBM Storage Ceph Object as a prerequisite to exploring the new feature available in IBM Storage Ceph 8.0: Policy-Based Data Retrieval. We will discuss using data lifecycle policies to transition cold data to a more cost-effective storage class by archiving it to IBM Cloud Object Storage (COS).

 

Ceph Terminology we will be using:

  • Zone Group: A collection of zones located mainly in the same geographic location (a.k.a. Region).

  • Zone: One or more instances that contain Object Gateway endpoints and are part of a zone group.

  • Placement: Logical separation of RADOS data pools within the same zone.

  • Storage Class: Storage classes customize the placement of object data, and S3 bucket lifecycle rules automate the transition between these classes.
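
For reference, these entities can be inspected on a running cluster with a few read-only radosgw-admin queries (a quick sketch using the default names from this walkthrough):

# radosgw-admin zonegroup list
# radosgw-admin zone list
# radosgw-admin zonegroup placement list --rgw-zonegroup=default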

Lifecycle Policies Introduction

The list below summarizes the various lifecycle policy types that the Ceph Object Gateway supports, along with an example use case for each:

  • Expiration: Deletes objects after a specified duration. Example use case: removing temporary files automatically after 30 days.
  • Noncurrent Version Expiration: Deletes noncurrent versions of objects after a specified duration in versioned buckets. Example use case: managing storage costs by removing old versions of files.
  • Abort Incomplete Multipart Upload: Cancels multipart uploads that are not completed within a specified duration. Example use case: freeing up storage by cleaning up incomplete uploads.
  • Transition Between Storage Classes: Moves objects between different storage classes within the same Ceph cluster after a duration. Example use case: moving data from SSD to HDD storage after 90 days.
  • NewerNoncurrentVersions Filter: Filters noncurrent versions newer than a specified count for expiration or transition actions. Example use case: retaining only the last three noncurrent versions of an object.
  • ObjectSizeGreaterThan Filter: Applies the lifecycle rule only to objects larger than a specified size. Example use case: moving large video files to a lower-cost storage class.
  • ObjectSizeLessThan Filter: Applies the lifecycle rule only to objects smaller than a specified size. Example use case: archiving small log files after a certain period.

In addition to specifying policies, lifecycle rules can be filtered using tags or prefixes, allowing for more granular control over which objects are affected. Tags can identify specific subsets of objects based on per-object tagging, while prefixes help target objects based on their key names.
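
As a minimal sketch of such a filtered rule (the logs/ prefix and the archive=true tag are hypothetical, chosen purely for illustration), a lifecycle configuration that combines a prefix and a tag filter could look like this:

{
  "Rules": [
    {
      "ID": "Archive tagged logs under the logs/ prefix",
      "Filter": {
        "And": {
          "Prefix": "logs/",
          "Tags": [
            { "Key": "archive", "Value": "true" }
          ]
        }
      },
      "Status": "Enabled",
      "Transitions": [
        { "Days": 90, "StorageClass": "ibm-cos" }
      ]
    }
  ]
}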

Configuring the Remote Cloud Service for Tiering

First, we need to configure the remote S3 cloud service as the future destination of our on-prem transitioned objects. In our example, we will create an IBM COS bucket named ceph-s3-tier.

It is important to note that we need to create a service credential for our bucket with HMAC keys enabled.
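
As a sketch of that step with the IBM Cloud CLI (the key and instance names are placeholders), an HMAC-enabled service credential can be created as follows; the resulting credential contains cos_hmac_keys.access_key_id and cos_hmac_keys.secret_access_key, which we will use in the tier configuration below:

# ibmcloud resource service-key-create ceph-tier-hmac Writer --instance-name "my-cos-instance" --parameters '{"HMAC": true}'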

Creating a New Storage Class for Cloud-S3 Tiering

Create a new storage class on the default placement target within the default zonegroup; we use the special RGW tier type --tier-type=cloud-s3 to configure the storage class against our previously created bucket in IBM COS:

# radosgw-admin zonegroup placement add --rgw-zonegroup=default --placement-id=default-placement --storage-class=ibm-cos --tier-type=cloud-s3

We can verify the available storage classes in the default zone group and placement target:

# radosgw-admin zonegroup get --rgw-zonegroup=default | jq .placement_targets[0].storage_classes
[
  "STANDARD_IA",
  "STANDARD",
  "ibm-cos"
]

Configuring Cloud-S3 Storage Classes with Tier Configurations

Next, we use the radosgw-admin command to configure the cloud-s3 storage class with specific parameters for our IBM COS bucket, such as the endpoint, region, and account credentials:

# radosgw-admin zonegroup placement modify --rgw-zonegroup default  --placement-id default-placement --storage-class ibm-cos --tier-config=endpoint=https://s3.eu-de.cloud-object-storage.appdomain.cloud,access_key=YOUR_ACCESS_KEY,secret=YOUR_SECRET_KEY,target_path="ceph-s3-tier",multipart_sync_threshold=44432,multipart_min_part_size=44432,retain_head_object=true,region=eu-de
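
As a quick sanity check, the tier configuration should now also appear under the placement target's tier_targets in the zonegroup JSON (same jq-style inspection as above):

# radosgw-admin zonegroup get --rgw-zonegroup=default | jq .placement_targets[0].tier_targets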

Applying Lifecycle Policies

Once the COS cloud-s3 storage class is in place, we switch to the role of a consumer of the IBM Storage Ceph Object S3 API and configure a lifecycle policy through the RGW S3 API endpoint. Our user is called tiering, and we have the AWS S3 CLI pre-configured with the credentials of the tiering user.

# aws --profile tiering --endpoint https://s3.cephlabs.com s3 mb s3://databucket
# aws --profile tiering --endpoint https://s3.cephlabs.com s3 cp /etc/hosts s3://databucket

We will now apply a lifecycle policy to the previously created bucket. The bucket databucket will have the following policy, transitioning all objects older than 30 days to the ibm-cos storage class:

Lifecycle Policy JSON:

{
  "Rules": [
    {
      "ID": "Transition objects from Ceph to COS that are older than 30 days",
      "Prefix": "",
      "Status": "Enabled",
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "ibm-cos"
        }
      ]
    }
  ]
}

As an S3 API consumer, I will use the AWS S3 CLI to apply the bucket lifecycle configuration, which I have saved to a local file called ibm-cos-lc.json:

# aws --profile tiering --endpoint https://s3.cephlabs.com s3api put-bucket-lifecycle-configuration --lifecycle-configuration file://ibm-cos-lc.json --bucket databucket

Verify that the policy is applied:

# aws --profile tiering --endpoint https://s3.cephlabs.com s3api get-bucket-lifecycle-configuration  --bucket databucket
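
The response should simply echo back the rule we applied (shown here as a sketch; field ordering may vary):

{
    "Rules": [
        {
            "ID": "Transition objects from Ceph to COS that are older than 30 days",
            "Prefix": "",
            "Status": "Enabled",
            "Transitions": [
                {
                    "Days": 30,
                    "StorageClass": "ibm-cos"
                }
            ]
        }
    ]
}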

We can also check that Ceph/RGW has registered this new LC policy using the following radosgw-admin command; the status is UNINITIAL, as this LC has never been processed. Once processed, it will move into the COMPLETE state:

# radosgw-admin lc list | jq .[1]
{
  "bucket": ":databucket:fcabdf4a-86f2-452f-a13f-e0902685c655.310403.1",
  "shard": "lc.23",
  "started": "Thu, 01 Jan 1970 00:00:00 GMT",
  "status": "UNINITIAL"
}

We can get further details of the rule applied to the bucket with the following command:

# radosgw-admin lc get --bucket databucket
{
    "prefix_map": {
        "": {
            "status": true,
            "dm_expiration": false,
            "expiration": 0,
            "noncur_expiration": 0,
            "mp_expiration": 0,
            "transitions": {
                "ibm-cos": {
                    "days": 30
                }
            }
        }
    }
}

Testing The Configured Lifecycle Policy

Important WARNING: changing this parameter is ONLY for testing purposes; don't use it on or near a production Ceph cluster!

We can speed up the testing of lifecycle policies by enabling a debug interval for the lifecycle process. In this setting, each "day" in the bucket lifecycle configuration is equivalent to 60 seconds, so a three-day expiration period is effectively three minutes:

# ceph config set client.rgw  rgw_lc_debug_interval 60
# ceph orch restart rgw.default
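
If we prefer not to wait for the next scheduled lifecycle run, processing can also be triggered by hand, which is convenient while testing (the --bucket filter narrows the run to a single bucket and is assumed to be available in this release):

# radosgw-admin lc process --bucket databucket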

If we now run radosgw-admin lc list, we should see the lifecycle for our transition bucket in a completed state:

[root@ceph01 ~]# radosgw-admin lc list| jq .[1]
{
  "bucket": ":databucket:fcabdf4a-86f2-452f-a13f-e0902685c655.310403.1",
  "shard": "lc.23",
  "started": "Mon, 25 Nov 2024 10:43:31 GMT",
  "status": "COMPLETE"
}

If we list the objects available in the transition bucket on our on-premises cluster, we can see that the objects now have a size of 0 bytes. This is because they have been transitioned to the cloud. However, the metadata/head of each object is still available because we set the retain_head_object=true parameter when creating the cloud storage class:

# aws --profile tiering --endpoint https://s3.cephlabs.com s3 ls s3://databucket
2024-11-25 05:41:33          0 hosts

If we check the object attributes using the s3api get-object-attributes call, we can see that the storage class for this object is now ibm-cos, so this object has been successfully transitioned into the S3 cloud provider:

# aws --profile tiering --endpoint https://s3.cephlabs.com s3api  get-object-attributes --object-attributes StorageClass ObjectSize --bucket databucket --key hosts
{
    "LastModified": "2024-11-25T10:41:33+00:00",
    "StorageClass": "ibm-cos",
    "ObjectSize": 0
}

If we check in IBM COS, using the AWS CLI S3 client but with the endpoint and profile of the IBM COS user, we can see that the objects are available in the IBM COS bucket. Due to API limitations, the original object modification time and ETag cannot be preserved, but they are stored as metadata attributes on the destination objects.

# aws --profile cos --endpoint https://s3.eu-de.cloud-object-storage.appdomain.cloud s3api head-object --bucket ceph-s3-tier --key databucket/hosts | jq .
{
  "AcceptRanges": "bytes",
  "LastModified": "2024-11-25T10:41:33+00:00",
  "ContentLength": 304,
  "ETag": "\"01a72b8a9d073d6bcae565bd523a76c5\"",
  "ContentType": "binary/octet-stream",
  "Metadata": {
    "rgwx-source-mtime": "1732529733.944271939",
    "rgwx-versioned-epoch": "0",
    "rgwx-source": "rgw",
    "rgwx-source-etag": "01a72b8a9d073d6bcae565bd523a76c5",
    "rgwx-source-key": "hosts"
  }
}

To avoid collisions across various buckets, the source bucket name is prepended to the target object name. If the object is versioned, the object version ID is appended to the end.

Below is the sample object name format:

s3://<target_path>/<source_bucket_name>/<source_object_name>(-<source_object_version_id>)
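
For the object used in this walkthrough (a non-versioned bucket), the mapping works out as follows:

Source object: s3://databucket/hosts
Target object: s3://ceph-s3-tier/databucket/hosts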

Semantics similar to those of LifecycleExpiration apply to versioned and locked objects. If the object is current, after transitioning to the cloud it is made noncurrent and a delete marker is created. If the object is noncurrent and locked, its transition is skipped.

Conclusion

This blog covered transitioning cold data to a more cost-effective storage class using tiering and lifecycle policies and archiving it to IBM Cloud Object Storage (COS). In the next blog, we will explore how to restore archived data to the Ceph cluster when needed. We will introduce the key technical concepts and provide detailed configuration steps to help you implement cloud restore, ensuring your cold data remains accessible when required.

Here is a link to continue reading the second blog.
