
Cache-Control in HATS Web Application

  
Caching is a technique that stores a copy of a given resource and serves it back when requested. When a web cache has a requested resource in its store, it intercepts the request and returns a copy of the stored resource instead of re-downloading it from the originating server.
Some of the advantages of caching are:
  • Improved Response Time: Caching allows faster content retrieval because the content need not be fetched from the server each time.
  • Low Network Usage: Content cached close to the client, for example in the browser cache, is served without a round trip to the server, which saves bandwidth and makes retrieval nearly instantaneous.
Disadvantages of caching are:
  • Stale data: If the resources on the server are updated for a new release, a client might see stale data because the browser still refers to the cached data of the earlier release.
This blog describes a way to avoid serving stale data to the user and to manage the caching of static resources such as CSS, JS, and images in the HATS web application.
The sections below explain:
  • Caching strategies
  • Cache-control header directives
  • Cache behavior with and without Cache-control directive
 

1.     Caching Strategies

  • HTTP Caching
HTTP Caching in web applications can be achieved by adding the following response headers:
  1. ETag: The entity tag identifies a specific version of a resource. When a cached file has expired, the browser sends the stored ETag back to the server (in the If-None-Match request header) to check whether its stale copy is still valid or a new version must be downloaded.
  2. Cache-Control: This header carries directives that control validation, cache behavior, and expiration. It is discussed later in this document.
  3. Expires: This header defines the date and time at which the resource stored in the cache expires. Once that time is reached, the browser considers the content stale.
  4. Last-Modified: This header gives the date and time at which the resource was last modified.
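To make the ETag validation step concrete, here is a minimal sketch in plain Java that needs no servlet container. The class name `ETagDemo`, the helper names, and the tag scheme (the first 16 hex characters of a SHA-256 hash) are assumptions for illustration only; real servers generate ETags in their own ways.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class ETagDemo {

    // Hypothetical scheme: quote the first 8 bytes of the SHA-256 hash as hex.
    static String etagFor(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 8; i++) {
                hex.append(String.format("%02x", digest[i]));
            }
            return "\"" + hex + "\"";
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every JVM", e);
        }
    }

    // Status the server would answer with: 304 if the client's tag still
    // matches the current content, 200 (full download) otherwise.
    static int statusFor(String ifNoneMatch, byte[] currentContent) {
        return etagFor(currentContent).equals(ifNoneMatch) ? 304 : 200;
    }

    public static void main(String[] args) {
        byte[] release1 = "console.log('release 1');".getBytes(StandardCharsets.UTF_8);
        byte[] release2 = "console.log('release 2');".getBytes(StandardCharsets.UTF_8);
        String cachedTag = etagFor(release1);
        System.out.println(statusFor(cachedTag, release1)); // 304: cached copy is still valid
        System.out.println(statusFor(cachedTag, release2)); // 200: resource changed on the server
    }
}
```

A 304 response has no body, so the browser keeps using its cached copy and only the headers travel over the network.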
 
  • Cache Busting
Cache busting solves the stale-cache problem by giving each file version a unique identifier. Because the identifier changes the URL, the browser treats the new version as a different resource and requests it from the origin server instead of reusing the cached copy.
  1. Versioning: Add a version number to the filename.
  2. Fingerprinting: Add a fingerprint based on the file contents to the filename.
  3. Append Query String: Append a query string to the end of the filename. This method is slightly less preferred.
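The three cache-busting variants can be sketched as small string helpers. This is an illustrative sketch only: the class `CacheBusting`, the method names, the `v` prefix, and the use of the first 8 hex characters of a SHA-256 hash as the fingerprint are assumptions, not something HATS provides.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class CacheBusting {

    // Versioning: test1.js -> test1.v2.js
    static String versioned(String fileName, String version) {
        return insertBeforeExtension(fileName, "v" + version);
    }

    // Fingerprinting: test1.js -> test1.<hash of contents>.js
    static String fingerprinted(String fileName, byte[] content) {
        return insertBeforeExtension(fileName, shortHash(content));
    }

    // Query string: test1.js -> test1.js?v=2
    static String withQuery(String fileName, String version) {
        return fileName + "?v=" + version;
    }

    private static String insertBeforeExtension(String fileName, String token) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return fileName + "." + token; // no extension: append the token
        }
        return fileName.substring(0, dot) + "." + token + fileName.substring(dot);
    }

    // First 4 bytes (8 hex characters) of the SHA-256 hash of the content.
    private static String shortHash(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 4; i++) {
                hex.append(String.format("%02x", digest[i]));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every JVM", e);
        }
    }

    public static void main(String[] args) {
        byte[] content = "console.log('hello');".getBytes(StandardCharsets.UTF_8);
        System.out.println(versioned("test1.js", "2"));
        System.out.println(fingerprinted("test1.js", content));
        System.out.println(withQuery("test1.js", "2"));
    }
}
```

A fingerprint changes only when the file contents change, so unchanged files keep their cached copies across releases, whereas a release-wide version number invalidates every file at once.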

The sections below describe HTTP Caching using the Cache-Control directive in the header of an HTTP response.

2.     Cache-control header directives

Some of the directives of this header include:
  • no-cache: The browser may store the response and reuse it, but only after validating it with the originating server to check whether the resource has changed.
  • public: The browser or any intermediary (such as a CDN or proxy) can cache the response.
  • private: Only the browser (a private cache) can cache the response; shared caches such as CDNs and proxies must not store it.
  • no-store: The browser must not store the response at all, so the content is fetched from the server on every request.
  • must-revalidate: Once a cached resource becomes stale, the cache must revalidate it with the server before using it. An expired resource must not be served as is.
  • max-age: The maximum time, in seconds, for which a cached resource is considered fresh and can be served from the cache. The time is counted from when the response was generated, and it overrides the Expires header (if set). Once max-age is exceeded, the resource is stale and is requested from the server again.
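As an illustration of how these directives appear on the wire, the sketch below splits a Cache-Control header value into its directives. The class `CacheControlParser` and its methods are hypothetical helpers, and the parsing is deliberately simplified: it splits on commas and does not handle quoted directive values.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class CacheControlParser {

    // Turn "public, max-age=300" into {public=, max-age=300}.
    // Directive names are lower-cased; directives without a value map to "".
    static Map<String, String> parse(String headerValue) {
        Map<String, String> directives = new LinkedHashMap<>();
        if (headerValue == null) {
            return directives;
        }
        for (String part : headerValue.split(",")) {
            String token = part.trim();
            if (token.isEmpty()) {
                continue;
            }
            int eq = token.indexOf('=');
            String name = (eq < 0 ? token : token.substring(0, eq)).trim().toLowerCase(Locale.ROOT);
            String value = eq < 0 ? "" : token.substring(eq + 1).trim();
            directives.put(name, value);
        }
        return directives;
    }

    // Returns the max-age in seconds, or -1 if it is absent or not a number.
    static long maxAgeSeconds(Map<String, String> directives) {
        String raw = directives.get("max-age");
        if (raw == null) {
            return -1;
        }
        try {
            return Long.parseLong(raw);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        Map<String, String> directives = parse("public, max-age=300");
        System.out.println(directives);                         // {public=, max-age=300}
        System.out.println(maxAgeSeconds(directives));          // 300
        System.out.println(parse("no-store").containsKey("no-store")); // true
    }
}
```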
 
Each directive serves a different purpose. Based on the requirement, we can decide which cache strategy best suits our Web application. In the next section, we will look at how we can introduce the Cache-control directive 'max-age' in our HATS Web application and its impact on the cache.

3.     Cache behavior with and without Cache-control directive

The sections below describe scenarios with and without the Cache-control directive.

3.1  Default behavior without Cache-Control directive

  1. Create a folder under Web Content and place all the required CSS, JS, and image files in it.
Figure 1 Project Structure
  2. Once the application is running, open the browser's inspect window and, under the Network tab, check the js/img/css files loaded from the /common/TestFolder/* URL. The first time, the files are downloaded from the server, as shown below.
Figure 2 Resources fetched the first time.
For example, for test1.js, there is no Cache-Control header in the Response Headers. The figure below is from the Network tab, without any filters implemented in the project.
Figure 3 HTTP Response header for the resource

  3. On subsequent requests to the URL, resources such as js/img/css files are not downloaded again but are loaded from the cache, as seen in Figure 4.
Figure 4 Subsequent request served from cache.
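A likely explanation for this behavior is heuristic caching: when a response carries neither Cache-Control nor Expires, HTTP allows a cache to estimate a freshness lifetime itself, typically from the Last-Modified header. The sketch below uses the commonly cited rule of 10% of the time since the last modification. It assumes the response includes Last-Modified, the class name `HeuristicFreshness` and the timestamps are made up for illustration, and actual browsers may use a different fraction or cap.

```java
import java.time.Duration;
import java.time.Instant;

final class HeuristicFreshness {

    // Estimated freshness lifetime: 10% of (Date - Last-Modified), never negative.
    static Duration estimate(Instant date, Instant lastModified) {
        Duration sinceModified = Duration.between(lastModified, date);
        if (sinceModified.isNegative()) {
            return Duration.ZERO;
        }
        return sinceModified.dividedBy(10);
    }

    public static void main(String[] args) {
        // Hypothetical header values: file last modified 10 days before it was served.
        Instant date = Instant.parse("2022-02-24T06:56:00Z");
        Instant lastModified = Instant.parse("2022-02-14T06:56:00Z");
        System.out.println(estimate(date, lastModified)); // PT24H
    }
}
```

Because this lifetime depends on how old the file happens to be, it is not under the application's control, which is the motivation for setting max-age explicitly.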

3.2  Implement Cache-Control in HATS Application
1.     Modify the HATS web.xml to add a filter entry. The filter class named in this entry is responsible for adding the max-age directive to the HTTP response header.
<filter id="cacheFilter">
       <description>Changes the Cache-Control header for static resources</description>
       <display-name>Cache Control</display-name>
       <filter-name>cacheFilter</filter-name>
       <filter-class>filters.CacheFilter</filter-class>
       <init-param>
              <param-name>maxAge</param-name>
              <param-value>300</param-value><!-- Equivalent to 5 mins -->
       </init-param>
</filter>
<filter-mapping>
       <filter-name>cacheFilter</filter-name>
       <url-pattern>/common/TestFolder/*</url-pattern>
</filter-mapping>

2.     Add the Java class CacheFilter
Create a Java class 'CacheFilter' that implements the Filter interface. For every request that matches the URL pattern, it adds Cache-Control: max-age=<value> to the response header, where the value comes from the maxAge init-param in web.xml.

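package filters;
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;
public class CacheFilter implements Filter {
       // Header name defined locally so that no extra library is needed for an HttpHeaders constant
       private static final String CACHE_CONTROL = "Cache-Control";
       private static final String CACHE_CONTROL_PREFIX = "max-age=";
       private static final long DEFAULT_MAX_AGE = 300;   // Seconds, equivalent to 5 mins
       private long maxAge = DEFAULT_MAX_AGE;
       public void destroy() {
       }
       // Adds "Cache-Control: max-age=<seconds>" to every response matching the filter mapping
       public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
              if (response instanceof HttpServletResponse) {
                     ((HttpServletResponse) response).setHeader(CACHE_CONTROL, CACHE_CONTROL_PREFIX + maxAge);
              }
              chain.doFilter(request, response);
       }
       // Reads the maxAge init-param from web.xml; keeps the default if it is missing or not a number
       public void init(FilterConfig fConfig) throws ServletException {
              String maxAgeStr = fConfig.getInitParameter("maxAge");
              if (maxAgeStr != null && maxAgeStr.trim().length() > 0) {
                     try {
                            maxAge = Long.parseLong(maxAgeStr.trim());
                     } catch (NumberFormatException e) {
                            e.printStackTrace();
                     }
              }
       }
}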
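The parsing done in init() can also be pulled out into a small helper that is testable without a servlet container. The following is a sketch under assumptions: the class `MaxAgeParser` and its methods are hypothetical, and rejecting negative values is an extra safeguard that the filter above does not have.

```java
final class MaxAgeParser {

    // Returns the parsed max-age in seconds, or the fallback when the value
    // is missing, blank, not a number, or negative.
    static long parseMaxAge(String raw, long fallback) {
        if (raw == null || raw.trim().isEmpty()) {
            return fallback;
        }
        try {
            long value = Long.parseLong(raw.trim());
            return value >= 0 ? value : fallback;
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Builds the header value the filter would send, e.g. "max-age=300".
    static String headerValue(long maxAgeSeconds) {
        return "max-age=" + maxAgeSeconds;
    }

    public static void main(String[] args) {
        System.out.println(headerValue(parseMaxAge("300", 300)));  // max-age=300
        System.out.println(headerValue(parseMaxAge("5 mins", 300))); // max-age=300 (fallback)
        System.out.println(headerValue(parseMaxAge(" 60 ", 300)));  // max-age=60
    }
}
```

Falling back to a default rather than failing keeps the application deployable even when the init-param in web.xml is mistyped.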


3.3  Verify Cache-Control directive
1. Add the filter entry and the filter class described in the previous section to the project and re-deploy it on the server. Clear the browser cache before the first request.
2. Once the application is running, open the browser's inspect window and check the js/img/css files under the Network tab. On the first request, these files are downloaded from the server, as shown below.
Figure 5 Resources fetched on the first request.

For example, for test1.js, the 'Date' response header indicates when the server generated the response. The response now also carries a new header, "Cache-Control: max-age", whose value is the time in seconds for which the resource can be served from the cache, counted from 'Date'.
For testing, max-age is set to 300 seconds, so every matching resource is cached for 5 minutes. The resource test1.js was first downloaded on 24 Feb at 06.56, as shown in the image below, so it remains cached until 06.56 + 300 secs (5 mins) = 07.01.
Figure 6 Cache-Control tag present in Response Header
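The expiry arithmetic above can be checked with a few lines of Java. This is a simplified sketch: the class `FreshnessCheck` is hypothetical, the year and the UTC time zone are assumptions (the screenshots only show 24 Feb, 06.56), and a real browser also accounts for details such as the Age header.

```java
import java.time.Instant;

final class FreshnessCheck {

    // The moment at which the cached copy becomes stale: Date + max-age.
    static Instant freshUntil(Instant responseDate, long maxAgeSeconds) {
        return responseDate.plusSeconds(maxAgeSeconds);
    }

    // True while the cached copy may still be served without contacting the server.
    static boolean isFresh(Instant responseDate, long maxAgeSeconds, Instant now) {
        return now.isBefore(freshUntil(responseDate, maxAgeSeconds));
    }

    public static void main(String[] args) {
        // Assumed year and time zone; only day and time come from the example.
        Instant downloaded = Instant.parse("2022-02-24T06:56:00Z");
        long maxAge = 300; // seconds, as configured in web.xml

        System.out.println(freshUntil(downloaded, maxAge));                               // 2022-02-24T07:01:00Z
        System.out.println(isFresh(downloaded, maxAge, Instant.parse("2022-02-24T06:59:00Z"))); // true
        System.out.println(isFresh(downloaded, maxAge, Instant.parse("2022-02-24T07:03:00Z"))); // false
    }
}
```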

3. Any request made to the URL before 07.01 is served from the cache.
Figure 7 Next request served from the cache before max-age

Click test1.js to see that the resource was served from the disk cache, as indicated by the Status Code below.
Figure 8 Resource served from the cache before max-age time.

4. The request below is made after 7 mins, i.e., at 07.03. The max-age has expired, so the resources are no longer served from the cache and are fetched from the server again.
Figure 9 Resources after max-age expires.

Figure 10 Resource is freshly fetched.


4.     References
https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching
https://docs.oracle.com/javaee/7/api/javax/servlet/Filter.html
https://www.ibm.com/docs/en/hats/9.7.0
 
5.     Contact us
For further information on Cache-Control in HATS Application, automation capabilities, and Lab services offerings, please write to:
ZIO@hcl.com

Akshata Betageri
Developer, Lab Services, IBM HACP & HATS

Comments

Wed July 06, 2022 11:14 PM

Very useful and detailed information on HATS. Thank you for sharing