Introduction:
In the midst of a security incident, time is of the essence: to limit the damage, it is crucial to minimize both time to detection and time to response. So today I wanted to discuss some of the concepts and mechanics of data ingestion in the context of real-time analytics in a SIEM. I’ll address where latency can creep in during data ingestion and the benefits of streaming data into a SIEM vs. collecting it from an Object Storage bucket.
Background on QRadar Integrations:
Let’s break a QRadar integration down into two fundamental pieces: (1) how the data gets to QRadar; and (2) how QRadar parses the data. QRadar leverages a “Protocol” to get data, and there are several types of protocols. There are true push protocols (HTTP Receiver, TLS Syslog, and Syslog), where devices send data to QRadar and QRadar listens for it.
There’s scheduled API polling (e.g. the Microsoft Graph Security API Protocol and the VMware vCloud Director Protocol), where QRadar makes queries to a device on a schedule, with some kind of parameter, to retrieve data. And finally, there are “streaming” solutions (e.g. Amazon Kinesis Data Streams), where we are connected live (or near live) to the stream and consume each record as soon as it is available. Once data is received by QRadar, it is parsed into normalized fields by our Device Support Modules (DSMs) and analyzed in real time by our correlation engines and our analytics models.
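To make the scheduled-polling pattern concrete, here is a minimal Python sketch. This is not QRadar code; the in-memory event source, field names, and functions are all hypothetical. It shows the core idea: each poll asks the source only for records created since the last successful poll.

```python
from datetime import datetime, timezone

# Hypothetical in-memory event source standing in for a remote REST API.
EVENTS = [
    {"created": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc), "msg": "login"},
    {"created": datetime(2024, 1, 1, 12, 3, tzinfo=timezone.utc), "msg": "logout"},
]

def query_events(created_after):
    """Stand-in for an API call such as GET /events?createdAfter=<timestamp>."""
    return [e for e in EVENTS if e["created"] > created_after]

def poll_once(last_seen):
    """One scheduled poll: fetch only new records, then advance the bookmark."""
    batch = query_events(last_seen)
    if batch:
        last_seen = max(e["created"] for e in batch)
    return batch, last_seen
```

A real poller would run `poll_once` on a timer and persist the bookmark between runs so no record is fetched twice.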
As our users have expanded their cloud footprints, we have worked to increase our integrations with cloud-native services. There are two trends in the patterns, or flows, of data egress from a cloud-native service to QRadar: (1) via an Object Storage bucket; or (2) streamed directly into QRadar. An example of an ingestion pattern that uses Object Storage is ingesting data from Amazon Elastic Kubernetes Service (EKS) into QRadar via an S3 bucket. An example of streaming data directly from a source into QRadar is streaming Azure Platform logs into QRadar via an Event Hub.
Where does latency come from?
QRadar can correlate data in real time, meaning that as soon as data hits the QRadar analytics pipeline, analysis can begin. QRadar ships out of the box with use cases that begin detecting threats as data is ingested. This means that threat detection in QRadar is proactive and does not depend on searches. Any latency in a detection with QRadar comes from latency on the data provider side.
The causes of latency can be broken into two pieces: (1) how quickly the data is written to a place QRadar can pull from, i.e. how fast a service publishes its data to an event pipeline or storage bucket; and (2) how frequently data is pushed or pulled into QRadar. It is also important to note that some services work on an “eventual delivery” model (e.g. Microsoft Office 365 and Amazon CloudWatch Logs), meaning that messages can arrive late while carrying the created time of the original event, so QRadar consumes them on a delay.
In these cases, the QRadar development team has factored the collection delay into the protocol to make sure data is not missed. Any QRadar protocol that accounts for collection delays exposes a configurable delay value, so the customer can balance the latency of incoming data against the risk of missing records.
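As a rough sketch of how a configurable collection delay can work (the function and parameter names here are my own for illustration, not QRadar's), the poller holds the upper edge of each query window back by the delay, so records that arrive late under eventual delivery still land inside a window that has not been queried yet:

```python
from datetime import datetime, timedelta, timezone

def delayed_window(now, last_end, collection_delay):
    """Compute the next [start, end) query window.

    The window's upper edge is held back by collection_delay so that
    late-arriving records (eventual delivery) are still inside a window
    that has not been collected yet.
    """
    end = now - collection_delay
    if end <= last_end:
        return None  # too soon: nothing is safely past the delay horizon yet
    return last_end, end
```

A larger delay means fewer missed records but slower detection; a smaller one means fresher data with some risk of gaps, which is exactly the trade-off the configurable value exposes.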
Ingestion from Object Storage:
Object storage, also known as Cloud Object Storage (COS) or “buckets” (AWS S3, Azure Blob Storage, IBM COS, Alibaba OSS, Oracle Object Storage), is a common component of cloud architectures and is often leveraged for data storage and data aggregation. Cloud providers typically give their services a way to publish logs or security findings to Object Storage buckets for further analysis, or for long-term storage to meet auditing and compliance requirements. Organizations therefore often standardize their logging by sending logs to these buckets. Once all the data is centrally located, it is often ingested from that single place into the SIEM, which makes set-up easier.
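A minimal sketch of the collection side of the bucket pattern, assuming listing entries shaped like S3 ListObjectsV2 `Contents` records (with `Key` and `LastModified` fields) and a set of already-ingested keys. This is illustrative only, not how any particular QRadar protocol is implemented:

```python
def new_objects(listing, processed_keys):
    """Given a bucket listing (entries with 'Key' and 'LastModified'),
    return only objects not yet ingested, oldest first, so the collector
    can fetch and parse them in order."""
    fresh = [o for o in listing if o["Key"] not in processed_keys]
    return sorted(fresh, key=lambda o: o["LastModified"])
```

The collector would run this on each poll, download the returned objects, and add their keys to `processed_keys` so nothing is ingested twice.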
However, there are limitations to collecting from Object Storage buckets. Any COS solution has some inherent delay, as data is aggregated before being sent to a SIEM. Often the data is not written to the bucket in real time: data is usually aggregated until a file size or interval is hit (e.g. a 100MB file, or one file every 5 minutes at minimum), and delivery is sometimes delayed even within the platform compared with a real-time option from the same provider (e.g. AWS CloudTrail delivering to S3 vs. sending to Kinesis Data Streams). This introduces latency. Additionally, data may not be polled into the SIEM continuously; it may be polled at intervals (e.g. 1 min, 2 min, 3 min), which again introduces latency.
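The worst-case end-to-end delay of the bucket pattern is roughly additive, which a one-line sketch makes plain (the breakdown is an assumption for illustration, not a measured figure):

```python
def worst_case_latency_s(aggregation_interval_s, poll_interval_s, platform_delay_s=0):
    """Worst case for bucket ingestion: an event lands just after a file was
    cut (waits a full aggregation interval), plus any platform-side delivery
    delay, plus just missing a poll (waits a full polling interval)."""
    return aggregation_interval_s + platform_delay_s + poll_interval_s
```

For example, a 5-minute file aggregation interval plus a 1-minute polling interval gives a worst case of about 360 seconds before an event even reaches the analytics pipeline, before any platform-side delivery delay is counted.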
While there are limitations from a real-time perspective, many organizations find that centralizing their logging in a COS approach has benefits, particularly because it simplifies permissions management for the organization.
Streaming Use Case:
Another pattern for ingesting data into a SIEM is to stream it directly, as it is being written, from a service to the SIEM. Cloud providers often provide streaming services such as Amazon Kinesis Data Streams, Azure Event Hubs, Apache Kafka, or Google Cloud Pub/Sub.
Streaming data into QRadar will always be the fastest way to get data to QRadar because we are actively connected to those data sources and there is no dependency on polling intervals. The only dependency is on how fast data is written into the data pipeline.
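The contrast with polling can be sketched as a consumer loop that handles each record the moment it appears on the pipeline. Here an in-memory queue stands in for a real stream such as Kinesis or Event Hubs; the function names are illustrative, not part of any QRadar API:

```python
from queue import Queue, Empty

def consume(stream, handle, idle_timeout=0.1):
    """Minimal streaming-consumer loop: each record is handled the moment it
    appears on the pipeline; there is no polling schedule to wait out.
    Returns the number of records processed before the stream went idle."""
    processed = 0
    while True:
        try:
            record = stream.get(timeout=idle_timeout)
        except Empty:
            return processed
        handle(record)  # e.g. parse with a DSM, then correlate in real time
        processed += 1
```

Because the consumer is always attached, the only latency left is how fast the service writes each record into the stream.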
Conclusion:
Hope you found this helpful! Please reach out if you have any further questions! Big THANK YOU to Chris Collins for his contributions to this post!
Thanks,
Wendy
Sources:
- CloudWatch Logs: Here
- Services that Publish to CloudWatch Logs: Here
- CloudTrail Sending to CloudWatch Logs: Here
- QRadar AWS Web Services Protocol: Here
- Retrieving Data from CloudWatch Logs: Here
- Kinesis Data Streams FAQ: Here
- Kinesis Product Page: Here
- DSM Guide: QRadar and Kinesis Data Streams: Here
- Getting Started with Kinesis Data Streams: Here