Schedule High-Volume Database cleanup tasks with Kubernetes or OpenShift

By Lachlan James Gleeson posted 2 days ago

  

Scheduling High-Volume Database cleanup tasks with Kubernetes

In the 11.0.1.0 release of IBM Verify Identity Access, administrators gained the ability to run High-Volume Database (HVDB) cleanup tasks via API or CLI requests.

This article demonstrates how to combine this capability with the Kubernetes CronJob resource definition to schedule HVDB cleanup for particular times of the day, days of the week, etc.

Container run mode

The configuration container was updated with a new role for running the HVDB cleanup tasks. If the HVDB_CLEANUP_TASKS environment property is set when the configuration container is bootstrapping, the management interface only listens on localhost and the given cleanup tasks are run, stopping when the HVDB_CLEANUP_TIMEOUT is reached (or after 1 hour if it is not set).

Trace is output to the container's log file while the cleanup tasks are run, recording how many rows were removed and how long the SQL commit operation took. Administrators can also use monitoring tools like Instana or Glowroot to collect additional metrics about cleanup task performance.

Defining the cleanup job

The trick to getting this job to run is bootstrapping the configuration container with an existing configuration snapshot. Typically, configuration snapshots are generated by the configuration container, published, then unpacked by runtime containers. For this example, the snapshot will provide the configuration container with the database connection information needed to run the cleanup tasks.

To bootstrap a configuration container, you will need to set the SOURCE_CONFIG_SERVICE_URL, SOURCE_CONFIG_SERVICE_USER_NAME, SOURCE_CONFIG_SERVICE_USER_PWD and SOURCE_CONFIG_SERVICE_TLS_CACERT environment variables. For a full list of all the options available, see the configuration container knowledge center documentation. For this example we will rely on the Verify Access Operator to host the configuration snapshot, and use the verify-access-operator secret generated by the operator to supply the source authentication information.

We also need to define the cleanup tasks to run. This example runs the entire list of available tasks, but you should run only the tasks that are relevant to your deployment. A production CRON schedule would typically target a low-traffic period, for example 3am on a Sunday ("0 3 * * 0"); the example below instead runs every three minutes so that it is quick to test. Update the schedule to best suit your deployment. Additionally, administrators can add an HVDB_CLEANUP_TIMEOUT property to change how long cleanup tasks are allowed to run, if the default 1 hour limit is not suitable.
 
The resulting Kubernetes CronJob yaml looks like:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: ivia-hvdb-clean
spec:
  schedule: "0/3 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          volumes:
          - name: iviaconfigvol
            emptyDir: {}
          - name: operator-cert
            secret:
              secretName: verify-access-operator
              items:
              - path: verify-access-operator.crt
                key: tls.cert
          containers:
          - name: ivia-hvdb-cleanup
            image: icr.io/ivia/ivia-config:11.0.1.0
            imagePullPolicy: IfNotPresent
            volumeMounts:
            - mountPath: /var/shared
              name: iviaconfigvol
            - mountPath: /tmp/verify-access-operator.crt
              name: operator-cert
              subPath: verify-access-operator.crt
            command:
            - "bash"
            - "-c"
            - "mkdir -p /var/shared/snapshots && /usr/sbin/bootstrap.sh"
            env:
            #- name: SNAPSHOT_ID
            #  value: "idp-published"
            - name: HVDB_CLEANUP_TASKS
              value: mmfaTransactions,mmfaAuthenticators,oauthToken,dmapCache,deviceRegistrations,authsvcSessionCache
            #- name: HVDB_CLEANUP_TIMEOUT
            #  value: "90"
            - name: SOURCE_CONFIG_SERVICE_URL
              valueFrom:
                secretKeyRef:
                  name: verify-access-operator
                  key: url
            - name: SOURCE_CONFIG_SERVICE_TLS_CACERT
              value: file:/tmp/verify-access-operator.crt
            - name: SOURCE_CONFIG_SERVICE_USER_NAME
              valueFrom:
                secretKeyRef:
                  name: verify-access-operator
                  key: user
            - name: SOURCE_CONFIG_SERVICE_USER_PWD
              valueFrom:
                secretKeyRef:
                  name: verify-access-operator
                  key: ro.pwd
          restartPolicy: OnFailure

Note: The cron schedule is set to run every three minutes! This is probably too frequent for any deployment, but it means you won't have to wait very long for the first job to start. Make sure you update the schedule to something less frequent before deploying to production.

Verifying cleanup with trace

Once the cleanup tasks have run successfully, a summary of how many records were removed and the time taken is returned as JSON data. Administrators can process these log messages to verify that the tasks ran successfully, and act accordingly:

{"type":"bootstrap","host":"ivia-hvdb-clean-29199072-tgkgb","ibm_userDir":"\/","ibm_serverName":"runtime","message":"WGAWA1007I   The High Volume Database cleanup tasks completed successfully.","ibm_threadId":"1","ibm_datetime":"2025-07-08T03:13:46+0000","ibm_messageId":"WGAWA1007I","module":"/usr/sbin/bootstrap.sh","loglevel":"AUDIT"}
{"mmfaTransactions":{"lastRan":"Tue Jul 08 03:12:55 GMT 2025","runTime":"670 ms","recordsRemoved":0,"status":"IDLE"},"mmfaAuthenticators":{"lastRan":"Tue Jul 08 03:13:04 GMT 2025","runTime":"106 ms","recordsRemoved":0,"status":"IDLE"},"authsvcSessionCache":{"lastRan":"Tue Jul 08 03:13:14 GMT 2025","runTime":"0 ms","recordsRemoved":0,"status":"IDLE"},"deviceRegistrations":{"lastRan":"Tue Jul 08 03:13:25 GMT 2025","runTime":"396 ms","recordsRemoved":0,"status":"IDLE"},"oauthToken":{"lastRan":"Tue Jul 08 03:13:34 GMT 2025","runTime":"147 ms","recordsRemoved":0,"status":"IDLE"},"dmapCache":{"lastRan":"Tue Jul 08 03:13:44 GMT 2025","runTime":"103 ms","recordsRemoved":0,"status":"IDLE"}}
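As a rough illustration of processing these messages, the sketch below scans container log text for the WGAWA1007I completion message and the per-task summary line. It assumes the log has the same shape as the sample output above (one JSON object per line); the function names and the abbreviated SAMPLE_LOG are hypothetical and not part of the product.

```python
import json

# Abbreviated copy of the sample output above, used only for illustration.
SAMPLE_LOG = "\n".join([
    '{"type":"bootstrap","message":"WGAWA1007I   The High Volume Database cleanup tasks completed successfully.","ibm_messageId":"WGAWA1007I","loglevel":"AUDIT"}',
    '{"mmfaTransactions":{"lastRan":"Tue Jul 08 03:12:55 GMT 2025","runTime":"670 ms","recordsRemoved":0,"status":"IDLE"},"oauthToken":{"lastRan":"Tue Jul 08 03:13:34 GMT 2025","runTime":"147 ms","recordsRemoved":0,"status":"IDLE"}}',
])


def summarize_cleanup(log_text):
    """Return (completed, summary) parsed from JSON-per-line log text.

    completed is True if a WGAWA1007I message was seen; summary is the
    per-task dictionary, or None if no summary line was found.
    """
    completed = False
    summary = None
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip anything that is not a JSON object
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed lines
        if record.get("ibm_messageId") == "WGAWA1007I":
            completed = True
        elif record and all(
            isinstance(value, dict) and "recordsRemoved" in value
            for value in record.values()
        ):
            # Every value looks like a task result, so treat it as the summary.
            summary = record
    return completed, summary


def total_removed(summary):
    """Sum recordsRemoved across all tasks in a summary dictionary."""
    return sum(task["recordsRemoved"] for task in summary.values())


if __name__ == "__main__":
    done, tasks = summarize_cleanup(SAMPLE_LOG)
    print("completed:", done)
    for name, result in tasks.items():
        print(name, result["recordsRemoved"], result["runTime"])
```

The log text could come from kubectl logs for the job's pod; a missing completion message or summary line could then be used to raise an alert.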

If the cleanup threads determine that they are being run on both the management interface and the AAC/Federation runtime, a warning message is logged in the management server's log file, suggesting that the automatic cleanup thread be disabled (by setting its frequency to -1):

{"type":"liberty_message","host":"ivia-hvdb-clean-29199072-tgkgb","ibm_userDir":"\/opt\/ibm\/wlp\/usr\/","ibm_serverName":"default","message":"FBTFDB015W TokenCleanupThread cleanup thread appears to be running in multiple places, this could lead to High-Volume Database errors.","ibm_threadId":"0000007a","ibm_datetime":"2025-07-08T03:13:34.678+0000","module":"TokenCleanupThread","loglevel":"WARNING","ibm_methodName":"run","ibm_className":"TokenCleanupThread","ibm_sequence":"1751944414678_00000000000AD","ext_thread":"Thread-54"}
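A similar check can flag this warning. The sketch below is a minimal example, assuming the warning always appears as a JSON log line whose message begins with FBTFDB015W, as in the sample above; the function name and the abbreviated sample line are hypothetical.

```python
import json

WARNING_ID = "FBTFDB015W"

# Abbreviated copy of the warning shown above, used only for illustration.
SAMPLE_WARNING = (
    '{"type":"liberty_message","message":"FBTFDB015W TokenCleanupThread cleanup '
    'thread appears to be running in multiple places, this could lead to '
    'High-Volume Database errors.","module":"TokenCleanupThread","loglevel":"WARNING"}'
)


def duplicate_cleanup_threads(log_text):
    """Return the distinct module names that logged the FBTFDB015W warning."""
    threads = []
    for line in log_text.splitlines():
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a JSON log line
        if not isinstance(record, dict):
            continue
        message = str(record.get("message", ""))
        if message.startswith(WARNING_ID):
            name = record.get("module", "unknown")
            if name not in threads:
                threads.append(name)
    return threads


if __name__ == "__main__":
    for thread in duplicate_cleanup_threads(SAMPLE_WARNING):
        print("Consider disabling the automatic cleanup thread:", thread)
```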

Tuning the cleanup tasks

The 11.0.1 release also added a number of configuration properties for controlling how frequently SQL commit operations are performed. All of the cleanup tasks now have the option of specifying the SQL batch size to use when removing rows. SQL batch operations group multiple similar statements together and apply them to the database in a single transaction. This is often more performant, particularly when a large volume of data needs to be removed.

For all cleanup tasks, it is recommended to use batch mode cleanup with a batch size of at least 1000 transactions/rows where possible. For other tuning parameters, such as the maximum number of transactions per user, suitable limits will depend on your business and regulatory requirements.
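To illustrate the general idea of batched deletes, the sketch below removes expired rows in groups of 1000 and commits once per group. It is not the product's implementation: it uses an in-memory SQLite database in place of the HVDB, and the tokens table, its columns, and the function names are all hypothetical.

```python
import sqlite3


def make_demo_db(expired, live, now=1000):
    """Create an in-memory table with the given number of expired and live rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tokens (id INTEGER PRIMARY KEY, expires_at INTEGER)")
    rows = [(now - 1,)] * expired + [(now + 1,)] * live
    conn.executemany("INSERT INTO tokens (expires_at) VALUES (?)", rows)
    conn.commit()
    return conn


def delete_expired_in_batches(conn, now, batch_size=1000):
    """Delete expired rows, committing once per batch; return rows removed."""
    removed = 0
    while True:
        # Pick the next batch of expired row ids.
        ids = [row[0] for row in conn.execute(
            "SELECT id FROM tokens WHERE expires_at < ? LIMIT ?",
            (now, batch_size),
        )]
        if not ids:
            return removed
        # Send the deletes together and commit them as one transaction.
        conn.executemany("DELETE FROM tokens WHERE id = ?", [(i,) for i in ids])
        conn.commit()
        removed += len(ids)


if __name__ == "__main__":
    demo = make_demo_db(expired=2500, live=10)
    print("rows removed:", delete_expired_in_batches(demo, now=1000))
```

A larger batch size means fewer commits for the same number of rows, at the cost of each transaction holding its locks for longer.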

If administrators are using manual cleanup tasks to remove data from the HVDB, the automatic cleanup threads should be disabled on the runtime. This can be done by setting the frequency advanced configuration property for the applicable thread to -1.

For more information, see the knowledge center documentation.
