Global Mailbox: Avoiding Resurrected Data
What is Global Mailbox?
IBM Sterling Global Mailbox helps companies meet demands for high-availability operations and redundancy with a robust and reliable data storage solution that is available across geographically distributed locations. It is an add-on to Sterling B2B Integrator (B2Bi) and Sterling File Gateway (SFG).
How does Global Mailbox work?
Global Mailbox uses several key concepts to provide a highly resilient B2Bi or SFG deployment.
Redundancy
Each deployment includes multiple instances of each component within each data center to ensure that services are always available.
The solution is deployed across multiple data centers to ensure that if there is a full data center outage, there’s another data center to accept requests and provide business continuity.
Data Replication
Mailbox data is replicated within and across data centers to reduce the risk of losing data. The system always stores multiple copies of mailbox data across multiple servers.
What database does Global Mailbox use?
Global Mailbox uses an open-source NoSQL database called Apache Cassandra. Cassandra is responsible for replicating data across many nodes and data centers to provide a consistent, highly available view of the mailbox data in Sterling File Gateway and Sterling B2B Integrator.
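To illustrate how this replication is expressed in Cassandra, here is a minimal sketch using the open-source Python driver. The contact point, keyspace name, and data center names are hypothetical; Global Mailbox creates and manages its own keyspaces during installation, so this is a generic Cassandra example rather than the product's actual schema.

# A minimal sketch using the open-source cassandra-driver package.
# The contact point, keyspace name and data center names are hypothetical.
from cassandra.cluster import Cluster

cluster = Cluster(["cassandra-node1"])   # any reachable Cassandra node
session = cluster.connect()

# NetworkTopologyStrategy keeps the requested number of replicas in each
# data center, which is what allows mailbox data to survive node and
# data center outages.
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo_mailbox
    WITH replication = {
        'class': 'NetworkTopologyStrategy',
        'DC1': 3,
        'DC2': 3
    }
""")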
What is a Zombie (resurrected data)?
Since Cassandra replicates data across many nodes, and since nodes can go down and come back up, there are opportunities for deleted data to be resurrected. This resurrected data is sometimes referred to as zombies.
Any data in Cassandra can be resurrected under certain circumstances. One example seen in the field is Global Mailbox messages/files re-appearing after they have been deleted. It’s very important to understand how this can happen so that you can prevent it from impacting your business. Resurrected data can be difficult to clean up because Cassandra has no method to identify it.
This article walks through a couple of examples of how data can be resurrected and how to prevent it.
Data resurrections due to system clocks being out of sync
Cassandra is a replicated data store and is highly fault tolerant. There are many nodes across many data centers.
In healthy situations, all Cassandra nodes are communicating with each other and data is quickly synchronized. To support high availability, Cassandra nodes can operate independently of the other nodes. This can result in conflicts as data is created, changed and deleted. For example, you may have two updates to the same row with a conflicting value for the “extraction_counter” column. One update may go to Cassandra Node 1, and another may go to Cassandra Node 2. If these nodes cannot communicate, there needs to be a method of resolving the conflict when they resume communication.
To resolve these conflicts, Cassandra uses a “last timestamp wins” approach. Each data change (mutation) is marked with a timestamp. When returning query results, Cassandra returns the data with the latest timestamp.
The timestamp for each mutation is provided by the client connected to Cassandra. The client is the B2Bi node, the SFG node, or the Global Mailbox Admin node. If the clocks on these nodes are out of sync, the conflict resolution may not return the desired data for queries.
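To make this concrete, here is a minimal sketch (hypothetical keyspace, table, and key names; the driver setup is assumed) showing how a mutation carrying a later client-supplied timestamp wins over a delete that arrives afterwards with an earlier timestamp:

# A minimal sketch with a hypothetical messages table; the point is only to
# show Cassandra's "last timestamp wins" behaviour with client timestamps.
from cassandra.cluster import Cluster

cluster = Cluster(["cassandra-node1"])     # assumed contact point
session = cluster.connect("demo_mailbox")  # hypothetical keyspace

# Upload handled on "B2Bi Node 2": its clock is ahead, so the insert carries
# the later timestamp (Cassandra timestamps are microseconds since epoch).
session.execute(
    "INSERT INTO messages (message_id, extraction_counter) "
    "VALUES (42, 0) USING TIMESTAMP 1700000020000000"
)

# Routing on "B2Bi Node 1" then deletes the message, but Node 1's clock is
# 5 seconds behind, so the delete carries an *earlier* timestamp.
session.execute(
    "DELETE FROM messages USING TIMESTAMP 1700000015000000 "
    "WHERE message_id = 42"
)

# The insert has the highest timestamp, so it wins: the row is still
# returned, as if the delete (the routing) never happened.
row = session.execute(
    "SELECT message_id, extraction_counter FROM messages WHERE message_id = 42"
).one()
print(row)   # still prints the row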
Here’s an example:
- B2Bi Node 1’s clock is 5 seconds behind B2Bi Node 2’s clock
- A partner connects to B2Bi Node 2 and uploads a file → all Cassandra inserts and updates are tagged with a timestamp of 1:15:20 pm
- The file is routed by B2Bi Node 1
- The routing logic deletes the file/message from the producer mailbox → the Cassandra inserts, updates and deletes are tagged with a timestamp of 1:15:15 pm
- A day later, the partner issues an “ls” command to view the files that were uploaded, expecting the routed files to not appear → Cassandra looks at the mutations (inserts, updates, deletes) and resolves conflicts based on timestamps. The end result is that the “inserts” from step 2 win and it appears as if the file was never routed!
How to prevent resurrected data due to clock issues
Ensure all clocks are in sync across all nodes. This includes all B2Bi/SFG nodes and all Cassandra nodes. You can do this by using an NTP server.
See Configuring Time Synchronization in the Global Mailbox documentation.
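As a quick sanity check (a sketch only, not part of the product; it assumes the third-party ntplib package and a reachable NTP server), you can report each node’s offset from your NTP source and compare the values across all B2Bi/SFG, Global Mailbox, and Cassandra nodes:

# A minimal drift-check sketch: run it on each node and compare offsets.
# The NTP server name and the threshold below are assumptions for this example.
import ntplib

NTP_SERVER = "pool.ntp.org"      # replace with your internal NTP server
MAX_OFFSET_SECONDS = 1.0         # arbitrary threshold for this example

response = ntplib.NTPClient().request(NTP_SERVER, version=3)
print(f"Offset from {NTP_SERVER}: {response.offset:+.3f} seconds")

if abs(response.offset) > MAX_OFFSET_SECONDS:
    print("WARNING: this node's clock is drifting; check its NTP configuration.")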
Data resurrections due to Cassandra synchronization problems
Deletes in Cassandra are not immediate. Initially, the data is only marked for deletion; this marker is called a tombstone. The tombstone is replicated across all Cassandra nodes. After a period of time, a process called compaction permanently removes the deleted data and the tombstones. The length of this period is controlled by the table’s gc_grace_seconds setting.
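If you want to confirm the setting yourself, gc_grace_seconds is stored per table in Cassandra’s system_schema tables. A minimal sketch, assuming the open-source Python driver and a hypothetical keyspace name:

# A minimal sketch (assumed contact point, hypothetical keyspace name) that
# lists gc_grace_seconds for every table in a keyspace.
from cassandra.cluster import Cluster

KEYSPACE = "mailbox"   # hypothetical Global Mailbox keyspace name

cluster = Cluster(["cassandra-node1"])
session = cluster.connect()

rows = session.execute(
    "SELECT table_name, gc_grace_seconds FROM system_schema.tables "
    "WHERE keyspace_name = %s",
    (KEYSPACE,),
)
for row in rows:
    days = row.gc_grace_seconds / 86400
    print(f"{row.table_name}: gc_grace_seconds={row.gc_grace_seconds} (~{days:.1f} days)")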
Since Cassandra is a replicated database, individual nodes can go down and come back up without affecting mailbox operations. However, while a node is down it becomes out of sync with the other nodes.
The Cassandra Reaper runs in the background to ensure that nodes are repaired (resynchronized) on a regular basis. This ensures that all data (including tombstones) is replicated to all nodes.
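The Reaper normally schedules these repairs for you. For an ad-hoc catch-up or a manual check, a repair can also be triggered with nodetool; a minimal sketch, assuming nodetool is on the PATH of the Cassandra host and using a hypothetical keyspace name:

# A minimal sketch that triggers a primary-range repair on the local node.
# In a Global Mailbox deployment the Cassandra Reaper normally schedules
# repairs; this is only for ad-hoc catch-up or verification.
import subprocess

KEYSPACE = "mailbox"   # hypothetical Global Mailbox keyspace name

result = subprocess.run(
    ["nodetool", "repair", "-pr", KEYSPACE],
    capture_output=True,
    text=True,
)
print(result.stdout)
if result.returncode != 0:
    print("Repair failed:", result.stderr)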
If there is an outage and the Reaper is not running, or it is not repairing fast enough, the following can occur.
Assumptions:
- Cassandra cluster has only 3 nodes
- Repairs aren’t running
- gc_grace_seconds is set to 2 days (this is the default for tables related to messages)
Flow of events:
- Dec 1 8:00 am: A partner uploads 10 files to their mailbox → all messages are stored properly in all Cassandra nodes
- Dec 1 8:01 am: A power outage causes Cassandra Node 3 to fail. This doesn’t impact operations as there are 2 other Cassandra nodes available.
- Dec 1 8:04 am: SFG routes the 10 files uploaded by the partner → Cassandra creates tombstones for the files deleted from the producer mailbox. These tombstones are sent only to Node 1 and Node 2. Node 3 does not have the tombstones.
- Dec 2: Cassandra Node 3 is still down. More files uploaded and routed. Node 3 becomes more out of sync.
- Dec 3: Cassandra Node 3 is still down. More files uploaded and routed. Node 3 becomes more out of sync.
- Dec 4: Compactions happen on Cassandra Node 1 and Node 2. The compaction process completely removes all data that was deleted more than 2 days ago, along with the tombstones. As a result, all rows for the files uploaded on Dec 1 are completely removed from Node 1 and Node 2. Since Cassandra Node 3 is still down, it still has the rows for those files but it does not have the tombstones marking them for deletion.
- Dec 5: Cassandra Node 3 is finally fixed and rejoins the cluster. Because this node never received the tombstones, rejoining effectively resurrects all the files that the partner uploaded on Dec 1.
- Dec 6: The partner logs in and does an “ls” on their mailbox. They see the resurrected files and raise a concern.
How to prevent resurrected data due to synchronization/repair issues
Ensure that the Cassandra Reaper is running at all times and that it successfully repairs all Global Mailbox keyspaces in Cassandra every gc_grace_seconds. Since the gc_grace_seconds for the message tables is 2 days, you must ensure that the repairs complete every 2 days (or faster). If a Cassandra node goes down, it must be brought back up and repaired within 2 days. If you cannot put a plan in place that allows this, contact IBM Support for other suggestions on how to avoid the problem.
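One simple way to catch this early (a monitoring sketch only; it assumes nodetool is on the PATH of the host it runs on) is to alert on any Cassandra node that nodetool reports as down:

# A minimal sketch that flags Cassandra nodes reported as down ("DN" lines in
# nodetool status output). Pair it with your own alerting so that a failed
# node is brought back and repaired well before gc_grace_seconds expires.
import subprocess

status = subprocess.run(
    ["nodetool", "status"], capture_output=True, text=True, check=True
).stdout

down_nodes = [line.split()[1] for line in status.splitlines()
              if line.startswith("DN")]

if down_nodes:
    print("Nodes down:", ", ".join(down_nodes))
    print("Bring them back and repair them before gc_grace_seconds (2 days) elapses.")
else:
    print("All Cassandra nodes are up.")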
Summary
If the Global Mailbox system isn’t configured properly and maintained well, data can be resurrected. This can result in processed files coming back to life and potentially being processed again.
It’s important that:
- All clocks are synchronized across all nodes
- Cassandra nodes are repaired at least once every 2 days
- A failed Cassandra node is brought back and repaired before 2 days have elapsed