Do you have big data sets and caching problems? Autocaching is a game changer

By Tina Chan posted Mon October 28, 2024 09:56 AM

Authored by: Tina Chan and Mitesh Vasa.

Image by Malik Johnson

You can now work with big data sets more easily in IBM Data Virtualization on Cloud Pak for Data 5.0.3, without having to manage your caches yourself. The new autocaching feature automates the entire cache lifecycle, from creating caches to evicting them. Autocaching manages only the caches that it creates, not the user-defined caches that you create, so you can continue to manage your own caches manually without interference.

In addition, you can customize the autocaching settings, including:

How often autocaching runs
The amount of storage space that auto-generated caches should occupy
The type of queries in your workload that you want autocaching to analyze
The name of auto-generated caches

How can you use autocaching?

Consider the following ways in which the autocaching feature in Data Virtualization can help you with your workflow:

Problem	How autocaching addresses the problem
Creating and tuning caches manually is too time-consuming in environments where data demands keep changing.	Autocaching does this work for you. It optimizes the most important queries in your workload and keeps the set of caches aligned with the current needs of the system.
High latency from remote data sources delays your decision-making processes.	Autocaching caches your most frequently used data, which reduces the need for time-consuming operations such as remote data joins and speeds up queries. This is especially helpful when you query remote data sources where network latency can cause performance issues.
You experience storage constraints when you work with larger queries.	Autocaching manages cache storage through a user-defined soft upper limit. When the total size of auto-generated caches exceeds this limit, autocaching evicts low-ranked caches to ensure that the system stays within storage limits.

How does autocaching work?

Autocaching uses the cache recommendation engine, which analyzes query workloads and generates a ranked list of recommended caches based on several factors, including query frequency, cardinality, and run time. Autocaching then creates the top-ranked caches in batches during its scheduled run.
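
To make the idea of ranking and batching concrete, here is a minimal Python sketch. It is not the actual cache recommendation engine: the `CacheCandidate` class, the `score` formula, and the batch size are all hypothetical and only illustrate how frequency, cardinality, and run time could be combined into a ranking.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class CacheCandidate:
    """A hypothetical cache recommendation; not a Data Virtualization type."""
    name: str
    frequency: int      # how often matching queries ran in the analyzed workload
    cardinality: int    # estimated number of rows the cache would hold
    run_time_s: float   # average run time of the matching queries, in seconds


def score(c: CacheCandidate) -> float:
    # Assumed heuristic: total query time saved per cached row.
    # The real engine's ranking formula is not described in this post.
    return c.frequency * c.run_time_s / max(c.cardinality, 1)


def rank(candidates: List[CacheCandidate]) -> List[CacheCandidate]:
    # Highest score first.
    return sorted(candidates, key=score, reverse=True)


def batches(ranked: List[CacheCandidate], batch_size: int) -> Iterator[List[CacheCandidate]]:
    # Yield the ranked candidates in fixed-size groups, best group first.
    for start in range(0, len(ranked), batch_size):
        yield ranked[start:start + batch_size]


candidates = [
    CacheCandidate("frequent_small_join", frequency=100, cardinality=1000, run_time_s=2.0),
    CacheCandidate("slow_remote_join", frequency=10, cardinality=100, run_time_s=30.0),
    CacheCandidate("rare_wide_scan", frequency=5, cardinality=10000, run_time_s=1.0),
]

for batch in batches(rank(candidates), batch_size=2):
    print([c.name for c in batch])
```

With this assumed heuristic, a slow query against a small result set outranks a fast query that runs more often, because each cached row saves more time.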

Autocaching also automatically evicts caches that aren’t used frequently. The eviction process removes low-ranked or unused caches first, which frees up space for higher-value caches that are more likely to benefit your workload.
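
The eviction behavior can be pictured with a short Python sketch. This is an illustration under stated assumptions, not the product's implementation: the `AutoCache` class, its `rank` field, and the `evict_to_limit` function are hypothetical names. Only auto-generated caches are passed in, because autocaching does not manage user-defined caches.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AutoCache:
    """A hypothetical auto-generated cache; not a Data Virtualization type."""
    name: str
    size_mb: int
    rank: int  # 1 is the highest-ranked cache (assumed convention)


def evict_to_limit(caches: List[AutoCache], soft_limit_mb: int) -> Tuple[List[AutoCache], List[AutoCache]]:
    """Evict the lowest-ranked caches until the total size fits the soft limit.

    Returns (kept, evicted). The input list is not modified.
    """
    kept = sorted(caches, key=lambda c: c.rank)  # best-ranked first
    evicted: List[AutoCache] = []
    total = sum(c.size_mb for c in kept)
    while kept and total > soft_limit_mb:
        victim = kept.pop()  # last element is the lowest-ranked cache
        evicted.append(victim)
        total -= victim.size_mb
    return kept, evicted


auto_caches = [
    AutoCache("cache_a", size_mb=400, rank=1),
    AutoCache("cache_b", size_mb=300, rank=2),
    AutoCache("cache_c", size_mb=500, rank=3),
]

kept, evicted = evict_to_limit(auto_caches, soft_limit_mb=800)
print("kept:", [c.name for c in kept])
print("evicted:", [c.name for c in evicted])
```

In this sketch the limit is checked only when the function runs, which mirrors the idea of a soft limit: the total can exceed the limit between runs and is brought back under it at the next one.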

The following diagram shows the autocaching process flow, including how it decides when to create and evict caches, and under what conditions these decisions are made.

Diagram showing the autocaching process, including when caches are created and evicted

Diagram by Malik Johnson

To use autocaching now, update your Data Virtualization service to the latest release.

Learn more

For information on how to enable and customize your autocaching settings, see Enabling autocaching in Data Virtualization in the Cloud Pak for Data documentation.
For information on how autocaching evicts and creates caches, see Autocaching in Data Virtualization in the Cloud Pak for Data documentation.
For information on how the cache recommendation engine works, see Cache recommendations in Data Virtualization in the Cloud Pak for Data documentation.


Permalink

https://community.ibm.com/community/user/blogs/tina-chan/2024/10/23/gotbig-data-sets-and-caching-problems-autocaching
