Disaster Recovery with Ceph RBD Namespace-Level Mirroring in IBM Ceph 8.1

By Sunil Angadi

Ceph RBD Mirroring

Ceph RBD mirroring is a disaster recovery (DR) feature that replicates RADOS Block Device (RBD) images from one Ceph cluster to another — typically from a primary site to a secondary (remote) site — using an asynchronous snapshot-based approach. In simpler terms, imagine you’re working on important documents in your office (primary cluster), and at regular intervals, a photocopy (snapshot) of your work is automatically sent to your friend’s secure office in another city (secondary cluster). If your office suffers a disaster — like fire, data loss, or hardware failure — you can visit your friend’s office and pick up right where you left off, using the most recent snapshot. Behind the scenes, a Ceph component called the rbd-mirror daemon monitors for snapshot changes and transfers only the incremental differences, making this setup efficient, scalable, and ideal for business continuity. RBD mirroring supports both one-way and two-way configurations, and it can be enabled at the pool, namespace, or individual image level, giving you fine-grained control over what data is mirrored.
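
For illustration, a minimal sketch of enabling mirroring at each of these levels with the rbd CLI (the pool, namespace, and image names are illustrative):

Pool level (mirror all images in the pool):
rbd mirror pool enable image-pool pool

Namespace level (mirror only images in one namespace):
rbd mirror pool enable image-pool/namespace-a image

Image level (enable snapshot-based mirroring on a single image):
rbd mirror image enable image-pool/image1 snapshot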

Snapshot-based mirroring

Snapshot-based mirroring in Ceph RBD is a method where only the changes between two points in time (i.e., snapshots) are replicated from a source cluster to a remote cluster, making the process efficient and incremental. When mirroring is enabled, Ceph creates periodic snapshots of an image (either manually or through automated scheduling), and the rbd-mirror daemon detects these snapshots and transfers only the new or modified data to the destination cluster. In simple terms, think of it like keeping a journal of your work: instead of sending the entire book every night to your backup location, you just send the new pages or edits made that day. This saves time, bandwidth, and storage. On the destination side, these snapshots are applied to reconstruct a near-live copy of the original image. Since it’s asynchronous, there might be a small delay, but this method offers a powerful balance of data protection, performance, and storage efficiency — making it the recommended mode of mirroring in modern Ceph deployments.
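
As a quick example (the image name is illustrative), a mirror snapshot can also be taken on demand for a mirror-enabled image; the rbd-mirror daemon then transfers only the delta since the previous mirror snapshot:

rbd mirror image snapshot image-pool/image1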

Namespace-level mirroring

Namespace-level mirroring in Ceph RBD allows users to selectively replicate only the images that belong to a specific namespace within a pool, rather than mirroring the entire pool or individual images manually. Technically, a namespace is a logical partition inside a pool — like a subdirectory — that lets you group RBD images under an isolated label (e.g., tenant1, finance, or dev). When mirroring is enabled at the namespace level, the rbd-mirror daemon only replicates the images from that specific namespace using snapshot-based mirroring, ignoring others in the same pool.
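
As a minimal sketch (pool and namespace names are illustrative), a namespace is first created inside the pool, and mirroring is then enabled on just that namespace:

rbd namespace create image-pool/tenant1
rbd mirror pool enable image-pool/tenant1 image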

[Figure: Architecture view of namespace-level mirroring]

Disaster Recovery 

Disaster Recovery (DR) in the context of Ceph RBD refers to the ability to quickly restore access to your critical data and applications in the event of a major failure — such as a site-wide power outage, hardware failure, data corruption, ransomware attack, or natural disaster. Ceph provides DR by replicating data (like RBD images) from one cluster (the primary site) to another geographically separate cluster (the secondary site) using snapshot-based mirroring. In layman’s terms, think of DR as a spare key to your digital home stored safely with a trusted friend in another city. If your home is destroyed or inaccessible, you can go to your friend’s place, use the spare key, and keep living your life with minimal disruption. Similarly, DR in Ceph ensures that your data is always recoverable and usable, even when the primary site goes down. It’s not just a backup — it’s a live, near-real-time copy of your data, ready to be promoted and used in emergencies with minimal downtime.

RBD Mirror Daemon

The RBD mirror is a specialized daemon process (rbd-mirror) in Ceph that is responsible for automatically replicating RBD images from one Ceph cluster to another — enabling disaster recovery and data protection through snapshot-based mirroring.

It continuously monitors for new snapshots on images that are enabled for mirroring and transfers only the changed data (incremental diffs) to the remote cluster, making the process efficient and asynchronous.
In simpler terms, think of the RBD mirror daemon as a robot courier that quietly watches over your digital storage and, every time it notices something new or changed, it delivers a copy of just that change to a secure location far away. It works behind the scenes, needs minimal manual intervention, and ensures that your mirrored images stay nearly in sync across two clusters. Each cluster runs its own rbd-mirror daemon, and in bidirectional mirroring setups, each robot is responsible for sending and receiving updates for different namespaces or images.
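
In a cephadm-managed cluster, the daemon is typically deployed through the orchestrator; a minimal sketch, assuming a host named host01:

ceph orch apply rbd-mirror --placement=host01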

One-way RBD Mirroring 

In Ceph RBD, mirroring is the process of replicating RBD images from one cluster to another for disaster recovery. In a one-way mirroring setup, only one cluster (the primary) owns and writes to the RBD images, and the other cluster (the secondary) receives a read-only replica of those images via the RBD mirror daemon. This is ideal for active-passive scenarios.

[Figure: One-way mirroring]
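
A common way to connect the two clusters as peers is the bootstrap workflow. For one-way mirroring, the token is imported on the secondary with the rx-only direction so that it only receives updates (site names, pool name, and token path are illustrative):

On the primary cluster:
rbd mirror pool peer bootstrap create --site-name site-a image-pool > /root/bootstrap_token_site-a

On the secondary cluster, after copying the token file across:
rbd mirror pool peer bootstrap import --site-name site-b --direction rx-only image-pool /root/bootstrap_token_site-a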

Two-way RBD Mirroring

Two-way mirroring (also called bidirectional mirroring) involves both clusters being capable of writing to their own distinct sets of images or namespaces, and mirroring them to each other — enabling active-active configurations. Technically, the two-way setup ensures no conflict by making each cluster the primary for different images or namespaces. Two-way mirroring is like you and your friend each having your own set of documents and regularly exchanging copies, so both of you act as a backup for each other. This setup provides mutual protection, higher availability, and supports regional load balancing.
[Figure: Two-way mirroring]
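
Two-way mirroring uses the same bootstrap workflow, except the token is imported with the rx-tx direction (the default), so the importing cluster both sends and receives updates (names and path are again illustrative):

rbd mirror pool peer bootstrap import --site-name site-b --direction rx-tx image-pool /root/bootstrap_token_site-a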

How it Works: Under the Hood – Namespace-Level Mirroring

⭐ Key Components

Namespace

  • A logical subdivision within a pool to organize related RBD images.

  • Acts like a “folder” inside the pool but with its own mirroring configuration.

  • Created and managed using: rbd namespace create <pool>/<namespace>

Snapshot-based Mirroring

  • Relies on RBD snapshots for change tracking and replication.

  • Works with:

    • Manual namespace-level snapshots

    • Scheduled namespace-level snapshots

  • Mirroring occurs at the namespace scope, meaning all mirror-enabled images in that namespace are mirrored.

RBD Mirror Daemon

  • Extended to handle namespace-scoped replication.

  • Tracks snapshots at the namespace level instead of per-image or per-pool.

  • Ensures all mirrored images in the namespace are consistent with their source snapshots.

Snapshot Scheduling

  • Policy that defines how often snapshots are taken for the namespace.

  • Can be set using the rbd mirror snapshot schedule add command (see the example after this list).

  • Schedules can be different for:

    • Pool level

    • Namespace level

    • Image level
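
For example (intervals and names are illustrative), a schedule can be added at each scope, and existing schedules can be listed recursively:

Pool level:
rbd mirror snapshot schedule add --pool image-pool 24h

Namespace level:
rbd mirror snapshot schedule add --pool image-pool --namespace namespace-a 6h

Image level:
rbd mirror snapshot schedule add --pool image-pool --image image1 1h

List all schedules:
rbd mirror snapshot schedule ls --pool image-pool --recursive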

Data Movement

  • Primary cluster → Secondary cluster

  • For each snapshot, only incremental changes since the last mirrored snapshot are replicated.

  • Namespace mirroring applies these changes to the corresponding namespace on the target cluster.

Failover / Failback

  • Namespace mirroring supports promoting images in the target namespace to primary role during DR scenarios.

  • Failback replays changes from the target → source when the original primary recovers, as sketched below.
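
A hedged sketch of the failover and failback commands, reusing the namespace-a/namespace-b pairing from the configuration example below (the image name is illustrative):

Planned failover: demote the image on the first cluster, then promote its peer on the second:
rbd mirror image demote image-pool/namespace-a/image1
rbd mirror image promote image-pool/namespace-b/image1

Disaster failover, when the primary is unreachable: force-promote on the second cluster:
rbd mirror image promote --force image-pool/namespace-b/image1

Failback: once the original primary is back, demote its stale copy and resync it from the new primary before switching roles back:
rbd mirror image demote image-pool/namespace-a/image1
rbd mirror image resync image-pool/namespace-a/image1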

Step-By-Step Guide

🎖️Pre-requisites

  1. A minimum of two running IBM Storage Ceph clusters

  2. IBM Storage Ceph version 8.1 or later; ensure both clusters (primary and secondary) run compatible versions

  3. A running rbd-mirror daemon on the secondary cluster for one-way mirroring, or on both clusters (primary and secondary) for two-way mirroring

  4. Root-level access to the nodes

To set up non-default namespace-level mirroring

The following example enables image mode mirroring between image-pool/namespace-a on the first cluster and image-pool/namespace-b on the second (remote) cluster.
Note: If the --remote-namespace option is not provided, the namespace is mirrored to a namespace with the same name in the remote pool.

On the first cluster:

[root@rbd-client ~]# rbd mirror pool enable image-pool/namespace-a image --remote-namespace namespace-b

On the second cluster:

[root@rbd-client ~]# rbd mirror pool enable image-pool/namespace-b image --remote-namespace namespace-a

[root@rbd-client ~]# rbd mirror pool info -p image-pool --namespace namespace-a
Mode: image
Mirror UUID: 77301310-0cb8-4a21-a170-b18f57f98fa5
Remote Namespace: namespace-b
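
Because the namespace is enabled in image mode, mirroring must still be turned on per image inside the namespace, and its replication state can then be checked (the image name is illustrative):

rbd mirror image enable image-pool/namespace-a/image1 snapshot
rbd mirror image status image-pool/namespace-a/image1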

To set up default namespace-level mirroring


Note: When configured in init-only mode, no images in the default namespace of the pool will be mirrored, but other namespaces can still be configured. This is needed to allow some other namespace to be mirrored to the default namespace of the remote pool, but it can be useful on its own as well.

[root@test-server2]# rbd mirror pool enable --site-name site-b test-pool init-only

Configure mirroring on the pool/namespace to mirror to the default namespace on cluster 1. If the Remote Namespace field in the output below is not empty, the default namespace is not configured correctly.

[root@test-server2]# rbd mirror pool enable test-pool/test-ns --remote-namespace "" image
[root@test-server2]# rbd mirror pool info test-pool/test-ns
Mode: image
Mirror UUID: 1e5e1b21-441b-4111-a475-5eb7fd19ab1b
Remote Namespace:

To disable mirroring on a namespace with rbd, specify the mirror pool disable command and the namespace.

rbd mirror pool disable POOL_NAME/NAMESPACE

To disable mirroring on the default namespace, run the mirror pool enable command in init-only mode.

Example

[root@test-server1]# rbd --cluster site-a mirror pool enable test-pool init-only
[root@test-server2]# rbd --cluster site-b mirror pool disable test-pool/test-ns

Use cases for namespace-level mirroring

1. 🏢 Multi-Tenant Cloud Providers

Scenario: A cloud service provider hosts multiple tenants in the same RBD pool, each isolated by namespaces like tenant-a, tenant-b, etc.

Why Namespace Mirroring?
They only want to mirror premium tenants (e.g., Gold-tier) to the DR site. Others may be on self-service or dev plans and don’t need replication.

Benefit: Saves cost and bandwidth while delivering DR guarantees to high-paying customers.


2. ☁️ Hybrid Cloud Deployment (On-Prem to Cloud)

Scenario: A company runs workloads on-prem in the default namespace but uses AWS or another Ceph cluster in the cloud for DR.

Why Namespace Mirroring?
They can mirror only production apps from the default namespace to a remote namespace like ns-prod-cloud without replicating dev/test workloads.

Benefit: Efficient DR without bloating cloud storage with unnecessary data.


3. 🏥 Healthcare Provider (Regulatory Compliance)

Scenario: A hospital uses Ceph to store both patient data and analytics workloads.

Why Namespace Mirroring?
They mirror only the namespace containing Protected Health Information (PHI) to meet HIPAA compliance, leaving non-sensitive data out.

Benefit: Regulatory compliance + reduced DR scope.


4. 🛒 E-commerce Platform

Scenario: An e-commerce platform separates image data into namespaces like orders, cart, and recommendations.

Why Namespace Mirroring?
Only orders and cart namespaces need to be mirrored — recommendations can be rebuilt from logs later.

Benefit: Protect critical stateful data, ignore transient analytics data.


5. 🧪 Dev/Test Isolation

Scenario: Dev and staging environments are also using Ceph but shouldn’t be part of the DR plan.

Why Namespace Mirroring?
Mirror only the prod namespace and skip dev or qa.

Benefit: Avoid unnecessary DR noise, reduce sync time, simplify failover.


6. 🧑‍⚖️ Government Agency (Data Sovereignty)

Scenario: A national agency stores classified and non-classified workloads in separate namespaces.

Why Namespace Mirroring?
They mirror only classified workloads to secure DR clusters while keeping public or non-sensitive workloads isolated.

Benefit: Compliance with national data laws and better risk segmentation.


7. 🛡️ Managed Security Services

Scenario: A company offers logging and incident response services using Ceph-backed storage.

Why Namespace Mirroring?
Only customer-specific logs (namespace: client-XYZ) are mirrored for premium clients. Internal dashboards are not mirrored.

Benefit: Per-client DR policies and audit readiness.


8. 📊 Big Data / Analytics Pipelines

Scenario: Teams use different namespaces for ingestion, processing, and long-term cold storage.

Why Namespace Mirroring?
Only mirror raw-ingest and live-results namespaces; archive data can be retrieved from long-term backup later.

Benefit: Mirror fast-changing datasets; skip slow/cold data.


9. 📱 Mobile Gaming Company

Scenario: Mobile game backend uses RBD for live game state, logs, and leaderboards — separated into namespaces.

Why Namespace Mirroring?
Only game-state and leaderboards are mirrored. Raw gameplay logs are skipped.

Benefit: Ensure players’ progress is always safe without mirroring unnecessary telemetry.


10. 🧬 Research Lab (Per-Project Protection)

Scenario: Each project or experiment has its own namespace (e.g., covid-study, ai-research).

Why Namespace Mirroring?
Mirror only mission-critical namespaces, such as ongoing clinical studies. Internal training workloads are not mirrored.

Benefit: Focus DR resources on live science, reduce waste on sandbox work.

Benefits of Namespace-Level Mirroring

  • Cost Optimization – Avoid mirroring low-value datasets.

  • Compliance – Target only sensitive or regulated data.

  • Performance – Reduce DR sync times and network load.

  • Flexibility – Fine-grained control at the namespace scope.

Conclusion

Namespace-level mirroring in Ceph RBD empowers organizations to customize their disaster recovery strategy by focusing on the data that matters most.
With incremental, snapshot-based replication, efficient namespace targeting, and flexible configuration options, Ceph delivers a modern, scalable DR solution for enterprises across industries.

For more details, please refer to the IBM Storage Ceph Documentation.
