
Universal Connector’s Traditional Approach vs. Kafka Approach

By Arwa Hussain posted Tue November 26, 2024 09:17 AM

  

The Universal Connector (UC), designed to collect and forward security events and audit data from various data sources to Guardium Data Protection (GDP), was initially configured with Rsyslog for data transmission. However, this conventional Rsyslog-based setup soon revealed significant limitations in enterprise environments and posed several operational challenges, listed below: 

  • Absence of a load-balancing mechanism – making it difficult to handle large volumes of data 

  • Delayed transmission and data loss – increased latency in audit data transmission during peak traffic periods compromised compliance and security 

  • Rsyslog’s limited disk-buffering capacity – increasing the susceptibility to data corruption 

  • Lack of centralized fault tolerance – a failed collector became a single point of failure, risking complete loss of its data 

 

Fig1. Traditional Rsyslog UC architecture
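
For context, the traditional pipeline amounts to Rsyslog tailing the data source’s audit log and forwarding it to a single UC endpoint. The excerpt below is a minimal sketch using standard Rsyslog modules (imfile and omfwd); the file path, hostname, port, and queue sizing are illustrative assumptions, not Guardium defaults.

    # Minimal sketch of a traditional Rsyslog forwarding pipeline (hypothetical values)
    module(load="imfile")                       # read audit records from a file
    input(type="imfile"
          File="/var/log/db/audit.log"          # database audit log (assumed path)
          Tag="db-audit:")

    action(type="omfwd"
           Target="uc-collector.example.com"    # single UC endpoint - no load balancing
           Port="5141" Protocol="tcp"
           queue.type="LinkedList"
           queue.filename="uc_fwd_q"            # disk-assisted queue
           queue.maxDiskSpace="1g")             # buffering capped by local disk

A single Target and a disk-capped queue make the limitations above concrete: the one endpoint is a single point of failure, and sustained peak traffic can exhaust the buffer.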
 

WHY?  

The Kafka mechanism provides: 

  • Load balancing – achieved when more than one UC reads from the same topic: Kafka distributes the topic’s partitions across the consuming UCs. 

  • Performance enhancement – follows from load balancing. With traffic distributed among the UCs, each UC processes a smaller, more orderly stream, which adds to overall throughput. 

  • Resilience and reliability – achieved by Kafka’s replication factor, whereby each message on a leader broker is replicated to follower brokers. In addition, if one UC connector fails, the other UC connectors continue to process the events. 

  • Data integrity – achieved by the replication factor together with the minimum in-sync replicas setting in Kafka. Even with brokers failing in a cluster, the replicated message on the surviving brokers is preserved and eventually processed by a UC. 
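
To make the load-balancing and resilience points concrete, here is a minimal consumer sketch using the open-source kafka-python client. It is illustrative only – the actual UC consumer is built into Guardium – and the broker addresses, topic name, group id, and certificate path are assumptions:

    from kafka import KafkaConsumer

    # Every consumer started with the same group_id joins one consumer group,
    # and Kafka splits the topic's partitions among the group's members.
    # Running several instances of this script demonstrates load balancing:
    # each instance receives a disjoint subset of the partitions.
    consumer = KafkaConsumer(
        "guardium-audit",                             # topic name (assumed)
        bootstrap_servers=["broker1:9093", "broker2:9093", "broker3:9093"],
        group_id="uc-consumers",                      # shared group id
        security_protocol="SSL",
        ssl_cafile="/path/to/cluster-ca.pem",         # cluster certificate
    )
    for record in consumer:
        print(record.value)   # stand-in for forwarding the event to GDP

If one instance stops, Kafka rebalances its partitions onto the surviving members of the group, which is the failover behaviour described above.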

 

WHAT?  

Kafka Cluster: 

A Kafka Cluster is a collection of multiple Kafka brokers working together to distribute, store, and manage data in a scalable manner, making it ideal for handling large-scale data flows. 

  

Fig2. Kafka integration with Data Producers and Data Consumers

A Kafka Cluster consists of the following components: 

  1. Brokers: 

  • These are the backbone servers of Kafka. 

  • They handle message storage, replication, and delivery. 

  • Each broker manages a part of the data, ensuring synchronization across the cluster. 

  2. Producers: 

  • Producers are applications or systems that send data to Kafka topics. 

  • Examples include database servers generating audit logs, security event sources, or monitoring tools. 

  3. Consumers: 

  • Consumers are applications that read and process data from Kafka topics. 

  • In Guardium, Universal Connectors act as consumers. 

  • They forward security events or audit logs to GDP collectors for compliance and monitoring. 

  4. Topics: 

  • Topics are logical channels that organize data streams. 

  • Each topic is configured for specific types of data or purposes. 

  • Retention policies can be applied to manage data over time. 

  5. Partitions: 

  • Topics are divided into smaller units called partitions. 

  • Partitions enable parallel processing of data for better performance. 

  • They also support data replication, ensuring fault tolerance and scalability. 
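
As a hedged illustration of how topics, partitions, and replication fit together, the sketch below creates a topic with the kafka-python admin client. The broker address, topic name, and counts are assumptions, chosen to match the 3-broker recommendation in the implementation steps below:

    from kafka.admin import KafkaAdminClient, NewTopic

    # Create a topic with 3 partitions (enabling parallel consumption) and a
    # replication factor of 3: each partition gets a leader plus two followers
    # on different brokers, so it survives individual broker failures.
    admin = KafkaAdminClient(bootstrap_servers="broker1:9092")
    admin.create_topics([
        NewTopic(name="guardium-audit", num_partitions=3, replication_factor=3)
    ])
    admin.close()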

 

 

Implementation of UC with Kafka: 

 

Fig3. Overview of UC over Kafka architecture

  1. Kafka cluster creation – The Central Management unit’s Kafka cluster management tool is used to create and manage the Kafka cluster, preferably with at least 3 Kafka brokers in the cluster to achieve resilience and persistence. Ref: https://www.ibm.com/docs/en/gdp/12.x?topic=configuration-creating-kafka-clusters 

  2. Rsyslog-Kafka communication – The cluster’s certificate is shared with Rsyslog for secure validation and communication. 

  3. Rsyslog configuration: 

     a. The bootstrap_servers parameter is set to the Kafka brokers’ DNS names (with port 9093 by default). 

     b. The placeholder parameters – DB server IP, log directory path, cluster certificate, and topic name – are replaced with actual values. For step-by-step details, refer to the IBM GitHub Universal Connectors repository; a hedged configuration sketch follows this list. 

  4. UC configuration – Provide a connector name in the UC template and save. 
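
As referenced in step 3, the sketch below shows the shape of an Rsyslog output over Kafka using Rsyslog’s omkafka module. The broker DNS names, topic, and certificate path are placeholders; the templates in the IBM GitHub Universal Connectors repository remain the authoritative reference.

    # Hypothetical sketch of an Rsyslog-to-Kafka output (omkafka module)
    module(load="omkafka")
    action(type="omkafka"
           broker=["broker1.example.com:9093",        # bootstrap_servers values
                   "broker2.example.com:9093",
                   "broker3.example.com:9093"]
           topic="guardium-audit"                     # topic name placeholder
           confParam=["security.protocol=ssl",
                      "ssl.ca.location=/path/to/cluster-ca.pem"])   # cluster certificate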

Currently, the Kafka UC feature is available on Guardium v12p20 and above. However, Guardium continues to support the traditional UC configuration with Rsyslog as well. 
 
UC plugins over Kafka: 

  1. EDB Postgres 

  2. Yugabyte 

 

 

Authors - Arwa Hussain, Nabanish Sinha, Manish 

