Welcome to Part 2 of our Guardium Big Data Intelligence blog series! To read the first blog, click here.
Data is the most important asset of almost any organization these days, which makes protecting data repositories ever more important. The focus has traditionally been on protecting databases, but as the data landscape has evolved, so has the need to protect data in its various forms and repositories. Data lives in databases, true, but it also lives in files, in Hadoop systems, in the cloud, and more. The data managed by Guardium has never been more complex or expansive, and the technology has grown significantly to meet the new requirements created by this data explosion. Furthermore, Guardium customers have always needed to keep additional data about access to their data repositories for historical reporting, but as environments have grown larger and more diverse, doing so has become increasingly challenging.
Additionally, the compliance landscape is constantly evolving. One only has to look at the increased retention requirements introduced by NY DFS, which call for three years of online retention.
Finally, as customers move toward analytics-based security and user behavior analysis, online and immediate access to history has become crucial: how do you build robust and reliable analytic models if you cannot leverage years of data?
The answer to all of these requirements in the Guardium world is Guardium Big Data Intelligence (GBDI), or "BigG" as we like to call it. Rather than collectors and aggregators that retain data for 10-30 days, Guardium Big Data Intelligence easily and cost-effectively retains Guardium data for 13 months, 3 years, or even longer, greatly simplifying retention for users. Rather than dealing with per-machine archive files, purge cycles, and the manual work of restoring archives, Guardium Big Data Intelligence serves as a single, simple long-term repository where all data remains online until purged and all retention is managed by one simple policy (see below).
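To make this concrete, here is a minimal sketch of what a single, unified retention policy might look like. It is purely illustrative: GBDI's actual policy is configured within the product, and the field names below are hypothetical rather than the real GBDI schema.

```python
# Purely illustrative sketch: these field names are hypothetical and do not
# reflect the actual GBDI configuration schema. The point is the shape of the
# idea: one declarative policy replaces per-appliance archive/purge cycles.

retention_policy = {
    "detailed_activity": {"keep_online_months": 13},  # full audit detail stays queryable
    "summaries":         {"keep_online_months": 36},  # aggregated data kept longer
    "purge":             "automatic",                 # expired data ages out on its own
}

def is_online(age_months: int, tier: str) -> bool:
    """Hypothetical helper: is data of a given age still online for its tier?"""
    return age_months <= retention_policy[tier]["keep_online_months"]

print(is_online(9, "detailed_activity"))  # True: a 9-month-old event is still queryable
```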
Amazingly, storage costs actually decrease with Guardium Big Data Intelligence even as retention periods are extended. Guardium Big Data Intelligence uses both compression and de-duplication to optimize storage. As a result, Guardium Big Data Intelligence customers gain between 7x and 10x storage savings for archived data, and up to 30x storage savings when redesigning the storage footprint of the entire deployment (including removing aggregators and reducing the storage footprint of collectors). So not only is data retention simplified, but significant money is saved as well!
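To see how those ratios translate into disk space, here is a quick back-of-the-envelope model. The 7x-10x savings ratios are the ones quoted above; the daily data volume and retention period are made-up figures for illustration, since actual volumes vary widely by deployment.

```python
# Back-of-the-envelope storage model. The 7x-10x savings ratios come from
# this post; the 50 GB/day volume and 3-year retention are made-up inputs.

def retained_tb(daily_gb: float, retention_days: int, savings_ratio: float) -> float:
    """TB of storage needed after compression and de-duplication."""
    raw_tb = daily_gb * retention_days / 1024
    return raw_tb / savings_ratio

daily_gb, days = 50, 3 * 365  # hypothetical deployment: 50 GB/day, 3-year retention
raw_tb = daily_gb * days / 1024
for ratio in (7, 10):
    print(f"{ratio}x savings: {retained_tb(daily_gb, days, ratio):.1f} TB "
          f"instead of {raw_tb:.0f} TB raw")
# 7x savings: 7.6 TB instead of 53 TB raw
# 10x savings: 5.3 TB instead of 53 TB raw
```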
As an example, take a leading US bank with around 200 Guardium collectors. Its one-year retention requirement is easily supported by a single Guardium Big Data Intelligence node with 7TB of storage. As part of the move from aggregators to Guardium Big Data Intelligence, the bank freed up over 40TB of SAN storage. And when it had to run a query for an event nine months in the past, there was no impact or hardship: the data was online and available, and no one had to go restore archive files.
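As a rough sanity check on those figures, apply the savings ratios quoted earlier to the bank's 7TB node (the bank's actual raw volumes are not published, so this is only implied arithmetic):

```python
# Implied arithmetic only: the bank's raw data volumes are not published.
# 7 TB of GBDI storage at the quoted 7x-10x savings corresponds to roughly
# 49-70 TB of raw-equivalent data, in the same ballpark as the 40+ TB of
# SAN storage freed by retiring the aggregators.
node_tb = 7
for ratio in (7, 10):
    print(f"{node_tb} TB at {ratio}x savings ~ {node_tb * ratio} TB raw-equivalent")
```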
As Hannibal from The A-Team said: "I love it when a plan comes together."