Let's look at some of the potential caching technologies for the application.
GlobalKTable as Cache
A GlobalKTable replicates the full source topic to every application instance, so each instance holds the complete, continuously updated dataset.
Because the GlobalKTable is materialized in a local state store, reads are low-latency local lookups.
Using a GlobalKTable, we can store the policy data and, after transforming it into a policy module, apply it to incoming events.
With far too many policies, though, the table can strain memory, and rebuilding it with all the policies on an application restart can take time. Moreover, since we don't need to execute every policy against each event, we must fetch specific policies on demand, and retrieving a particular policy module from a GlobalKTable isn't straightforward.
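The mechanism behind a GlobalKTable can be sketched without the Kafka Streams API. This is a conceptual illustration, not Kafka's actual classes: each instance replays the full topic into a local map, keeping the latest value per key, and reads become plain local lookups. The record type and policy names are made up for the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GlobalTableSketch {
    // One changelog record; a later record for the same key overrides an
    // earlier one, mirroring the semantics of a compacted Kafka topic.
    record PolicyRecord(String key, String value) {}

    // Replay the full topic into a local map, keeping the latest value per key.
    static Map<String, String> materialize(List<PolicyRecord> topic) {
        Map<String, String> localStore = new HashMap<>();
        for (PolicyRecord r : topic) {
            localStore.put(r.key(), r.value());
        }
        return localStore;
    }

    public static void main(String[] args) {
        // Every instance replays the *entire* topic, so each one ends up
        // holding the complete policy dataset locally.
        Map<String, String> store = materialize(List.of(
                new PolicyRecord("policy-a", "rego source v1"),
                new PolicyRecord("policy-b", "rego source v1"),
                new PolicyRecord("policy-a", "rego source v2")));

        // Reads are plain local lookups: no network hop, low latency.
        System.out.println(store.get("policy-a")); // rego source v2
    }
}
```

This also makes the trade-off visible: the map holds every policy, whether or not the current event needs it.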
KCache
KCache is an in-memory cache that gains persistence by backing up its data to a log-compacted Kafka topic. The data from the topic is read into an in-memory cache that implements the java.util.SortedMap interface, and log compaction guarantees that at least the last known value for each message key is retained.
Because persistence is delegated entirely to Kafka, we need not worry about managing it ourselves, and a remote copy of the data always exists: after an application restart or in-memory data loss, the cache can be rebuilt from the topic.
We could use KCache to store the policy data, but we can't store a policy module directly, as there isn't a built-in Serde for OPA policy modules.
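The recovery behaviour can be sketched in plain Java. This is not KCache's API, only the underlying mechanism: the in-memory sorted map is rebuilt by replaying the compacted backing topic, where a record with a null value (a tombstone) deletes its key.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class KCacheSketch {
    // Rebuild the in-memory map by replaying the backing topic from the
    // beginning; a null value is a tombstone that removes the key.
    static TreeMap<String, String> restore(List<? extends Map.Entry<String, String>> topic) {
        TreeMap<String, String> cache = new TreeMap<>(); // exposed as a SortedMap
        for (Map.Entry<String, String> record : topic) {
            if (record.getValue() == null) {
                cache.remove(record.getKey());
            } else {
                cache.put(record.getKey(), record.getValue());
            }
        }
        return cache;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> topic = List.of(
                new SimpleEntry<>("policy-b", "rego source"),
                new SimpleEntry<>("policy-a", "rego source"),
                new SimpleEntry<>("policy-b", null)); // policy-b was deleted

        TreeMap<String, String> cache = restore(topic);
        System.out.println(cache.keySet()); // [policy-a], in sorted order
    }
}
```

Note that the values here are plain strings; this is exactly why a compiled policy module cannot go into KCache directly without a Serde to turn it into bytes and back.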
Redis
Redis (Remote Dictionary Server) is an open-source, in-memory database that stores data as key-value pairs, offering high-speed reads and writes with low response times.
Redis also provides a few options for persisting data:
RDB (Redis Database): Redis captures a snapshot of the dataset at specified intervals and persists it to disk. RDB is not a good fit if the application cannot tolerate losing the writes made since the last snapshot.
AOF (Append Only File): Redis logs every write operation the application performs to the AOF. On restart, all the logged operations are replayed to reconstruct the dataset.
RDB + AOF: You can also combine the two, keeping the durability of the AOF while allowing faster restarts.
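As a sketch, the three persistence modes map onto redis.conf directives like these (the specific values are illustrative, not recommendations):

```conf
# RDB: snapshot to disk if at least 100 keys changed in the last 60 seconds
save 60 100

# AOF: log every write; the fsync policy trades durability against latency
appendonly yes
appendfsync everysec

# RDB + AOF: let AOF rewrites start with an RDB preamble for faster loading
aof-use-rdb-preamble yes
```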
For the policy engine application, Redis could hold the policy data instead of the file system. But again, Redis doesn't provide a way to store the compiled policy modules directly.
Caffeine
Caffeine is a high-performance, in-memory caching library for Java. It is inspired by Google's Guava cache but adds various improvements, most notably the Window TinyLFU eviction policy, which tracks access frequency to estimate how useful an entry has been.
Based on that access history, TinyLFU decides whether a newly accessed item should be admitted to the cache in place of the current eviction candidate.
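The admission decision can be sketched in plain Java. This is a deliberate simplification of Window TinyLFU, with an exact counter table standing in for Caffeine's compact frequency sketch and insertion order standing in for its LRU victim selection: when the cache is full, a newcomer is admitted only if its estimated frequency beats the eviction candidate's.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class TinyLfuSketch {
    // Access-frequency estimates; real TinyLFU uses a compact
    // count-min-style sketch with periodic aging, not exact counters.
    private final Map<String, Integer> freq = new HashMap<>();
    // Insertion-ordered map; the eldest entry is the eviction candidate.
    private final LinkedHashMap<String, String> cache = new LinkedHashMap<>();
    private final int maxSize;

    TinyLfuSketch(int maxSize) { this.maxSize = maxSize; }

    private void recordAccess(String key) { freq.merge(key, 1, Integer::sum); }

    String get(String key) {
        recordAccess(key);
        return cache.get(key);
    }

    void put(String key, String value) {
        recordAccess(key);
        if (cache.size() < maxSize || cache.containsKey(key)) {
            cache.put(key, value);
            return;
        }
        // Cache is full: compare the newcomer against the eviction candidate.
        String victim = cache.keySet().iterator().next();
        if (freq.getOrDefault(key, 0) > freq.getOrDefault(victim, 0)) {
            cache.remove(victim);   // admit the newcomer, evict the candidate
            cache.put(key, value);
        }                           // otherwise the newcomer is rejected
    }

    boolean contains(String key) { return cache.containsKey(key); }
}
```

The useful property is visible in the sketch: a key touched once cannot push out a key that has been accessed many times, which is what makes the policy resistant to one-off scans.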
Caffeine is purely in-memory and does not provide any persistence.
We can use Caffeine to store our policy modules. Although memory limits mean we cannot cache every policy module, we can let Caffeine's near-optimal eviction policy decide which ones to keep. On a cache hit, the policy module is available immediately, since modules are stored directly in the cache.
After weighing the volume of incoming events against the number of policies, Caffeine proved the best fit for the application.
Decisions, decisions. With these options you may feel spoilt for choice, but remember that the right cache depends on your use case: what you will be caching, your performance requirements, and how you need to scale.