
IBM Security QRadar Disaster Recovery for AWS environments

By George Mina posted Tue November 24, 2020 01:00 PM


Co-Authors: Rory Bray and George Mina

Disaster recovery (DR) is a key element of protecting against availability zone (AZ) failures, particularly in AWS environments where EC2 instances are hosted in multiple locations worldwide. Instances should be distributed across multiple AZs to reduce the risk of failure and to allow requests to be handled in another AZ. If, on the other hand, all instances are located in a single location and that location fails, none of those instances will be available.

With the recent availability of the IBM QRadar Data Synchronization App, QRadar provides a number of DR-related features. The app provides a resilience solution for QRadar deployments so that operations can continue as normally as possible in DR scenarios. If your hardware or network fails, IBM QRadar can continue to collect, store, and process event and flow data.

In the context of AWS availability zones, QRadar can be configured to meet DR and HA requirements by leveraging AWS Lambda. The steps are as follows: 

  1. Set up primary and secondary deployments of QRadar with the Data Synchronization App
  2. Use AWS Elastic Load Balancing to direct all traffic to the primary availability zone first
  3. If the primary zone fails, the AWS load balancer signals the failure through an Amazon CloudWatch alarm, which invokes the AWS Lambda function
  4. The Lambda function assesses the current state (which hosts are healthy and which are not)
  5. The function makes an API call to QRadar’s Data Synchronization App to initiate automated failover to the secondary zone
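Steps 3 to 5 can be sketched as a small Lambda handler. This is a minimal illustration, not the authors' implementation. It assumes the CloudWatch alarm reaches Lambda through an SNS topic, so the alarm details arrive as a JSON string in Sns.Message. The failover URL, the use of a SEC token header, and the three environment variable names are hypothetical placeholders; the real endpoint and authentication scheme should be taken from the Data Synchronization App documentation.

```python
import json
import os
import urllib.request


def alarm_states(event):
    """Return (alarm name, new state) pairs from an SNS-delivered alarm event."""
    states = []
    for record in event.get("Records", []):
        # CloudWatch publishes the alarm details as a JSON string in Sns.Message.
        message = json.loads(record["Sns"]["Message"])
        states.append((message["AlarmName"], message["NewStateValue"]))
    return states


def should_fail_over(event, primary_alarm):
    """True only when the alarm watching the primary zone has entered ALARM."""
    return any(
        name == primary_alarm and state == "ALARM"
        for name, state in alarm_states(event)
    )


def trigger_failover(url, token, opener=urllib.request.urlopen):
    """POST to the failover endpoint and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=b"{}",
        method="POST",
        # Assumption: the token is sent in a SEC header; verify against the app's API.
        headers={"SEC": token, "Content-Type": "application/json"},
    )
    with opener(request, timeout=10) as response:
        return response.status


def lambda_handler(event, context):
    # All three environment variable names below are hypothetical.
    primary_alarm = os.environ["PRIMARY_ALARM_NAME"]
    if not should_fail_over(event, primary_alarm):
        return {"failover": False}
    status = trigger_failover(
        os.environ["QRADAR_FAILOVER_URL"], os.environ["QRADAR_API_TOKEN"]
    )
    return {"failover": True, "status": status}
```

Keeping the decision (should_fail_over) separate from the network call (trigger_failover) lets the health assessment be tested without contacting QRadar, and the opener parameter allows a stub to stand in for the real HTTP request.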


Setup for CloudWatch alarm based on UnHealthyHostCount metric
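An alarm like this can also be defined in code. The sketch below only builds the parameter set for CloudWatch's PutMetricAlarm operation so that it runs without AWS credentials. The namespace default (a Network Load Balancer), the 60-second period, the two evaluation periods, and the threshold of one unhealthy host are illustrative assumptions, not values from the original setup, and the function name is hypothetical.

```python
def unhealthy_host_alarm(alarm_name, target_group, load_balancer, sns_topic_arn,
                         namespace="AWS/NetworkELB"):
    """Build PutMetricAlarm parameters for the UnHealthyHostCount metric."""
    return {
        "AlarmName": alarm_name,
        # Assumption: use "AWS/ApplicationELB" for an Application Load Balancer.
        "Namespace": namespace,
        "MetricName": "UnHealthyHostCount",
        "Dimensions": [
            {"Name": "TargetGroup", "Value": target_group},
            {"Name": "LoadBalancer", "Value": load_balancer},
        ],
        "Statistic": "Maximum",
        "Period": 60,              # seconds per evaluation window (illustrative)
        "EvaluationPeriods": 2,    # consecutive breaching windows (illustrative)
        "Threshold": 1,            # alarm when at least one host is unhealthy
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # The SNS topic that the failover Lambda function subscribes to.
        "AlarmActions": [sns_topic_arn],
    }
```

With boto3 installed and credentials configured, the result could be applied with boto3.client("cloudwatch").put_metric_alarm(**params), where params is the dictionary returned above.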


The above steps show how users can automate failover scenarios for QRadar in AWS environments through multi-AZ resilience.

Key Features

  • Simple configuration of an HA solution via the Data Synchronization App for QRadar
  • Automated DR failover to a secondary site to ensure business continuity
  • Automated data synchronization for data store resilience (QRadar events and flows) and processing site resilience
  • Centralized visibility into the ‘health’ of target group hosts per AZ


Learn More: