Track data usage in BAI

By NICOLAS SAUTEREY posted 3 days ago

  

Why measure usage in BAI

When you use Business Automation Insights (BAI), you accumulate a large quantity of time series events and, for stateful event sources, summary events.

Storage consumption grows and queries take longer to execute, which in turn slows down the Business Performance Center dashboards. And you may not easily realize how much data the OpenSearch database holds, or how fast it is filling up.

We designed a solution that you can deploy in your environment to continuously measure and monitor the number of documents stored in the indices dedicated to monitoring sources, along with the average and total size of the documents in these indices. The solution also computes this information globally across all monitoring source indices. It intentionally does not measure storage in the other indices used by BAI, which are not expected to grow to large volumes.

How it is done at a high level

The main ideas in this solution are the following:

We wanted to provide, for each BAI monitoring source (in reality, for the indices used by that monitoring source): the number of documents stored, the average document size (measured on the JSON source), and the total cumulative document size. This is not the storage consumed by OpenSearch. That storage includes the space required to store the documents themselves (which are compressed, so there is no direct way to relate storage to document size), plus the space required for indexing, query caches, duplication across shards, and so on. The storage space used by OpenSearch can be found easily with the OpenSearch APIs, and technical observability tools like Instana can help you track how your disk usage evolves before the persistent volume fills up.

So we created a shell script that runs queries on OpenSearch, iterating over the monitoring sources, and gets the number of documents and their sizes. If you are familiar with OpenSearch, you may know that it has no built-in API that returns the source size of documents, only the storage consumed (globally, or per index shard). So we had to devise a workaround: we take a random sample of documents in the index (actually in the alias that represents the list of indices for a monitoring source) and compute the average size of the documents returned. With a large enough sample, the average is stable enough to give a good approximation. From this average and the document count, we compute the total size of the documents in the alias.
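The sampling idea can be sketched in a few lines of Python. This is only an illustration, not the code of the script itself: the function names are hypothetical, the random ordering uses the standard function_score query with random_score, and the size of a document is assumed to be the byte length of its compact JSON source. In practice the query body would be sent to the search endpoint of the alias, and the document count would come from its count endpoint.

```python
import json

def random_sample_query(sample_size):
    """Search body that returns `sample_size` documents in random order."""
    return {
        "size": sample_size,
        "query": {
            "function_score": {
                "query": {"match_all": {}},
                "random_score": {},
            }
        },
    }

def average_source_size(hits):
    """Average size, in bytes, of the compact JSON form of each hit's _source."""
    if not hits:
        return 0.0
    sizes = [
        len(json.dumps(hit["_source"], separators=(",", ":"),
                       ensure_ascii=False).encode("utf-8"))
        for hit in hits
    ]
    return sum(sizes) / len(sizes)

def estimate_total_size(doc_count, hits):
    """Estimated cumulative source size: document count times sampled average."""
    return doc_count * average_source_size(hits)
```

The estimate is only as good as the sample, so a larger sample size trades a heavier query for a more stable average.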

Then we return the results as a CSV file, for you to use in any way you want.

By running this script on a schedule (with a cron scheduler, or a CronJob in Kubernetes), you can measure the evolution over time, forecast future consumption, and so on. Depending on your needs, you can choose an hourly or a daily cadence, for instance.
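As an illustration of what a forecast could look like, here is a minimal Python sketch that extrapolates linearly from successive measurements. It assumes you have already extracted (timestamp, total size in bytes) pairs from the reports; the function names and the input shape are hypothetical, and a straight line between the first and last sample is the simplest possible model, not necessarily the right one for your data.

```python
from datetime import datetime

def daily_growth(samples):
    """Average growth in bytes per day between the first and last sample.

    `samples` is a list of (iso_timestamp, total_bytes), oldest first.
    """
    (t0, s0), (t1, s1) = samples[0], samples[-1]
    elapsed = datetime.fromisoformat(t1) - datetime.fromisoformat(t0)
    days = elapsed.total_seconds() / 86400
    if days <= 0:
        raise ValueError("need at least two samples taken at different times")
    return (s1 - s0) / days

def days_until(samples, threshold_bytes):
    """Days before the total size reaches the threshold, or None if not growing."""
    growth = daily_growth(samples)
    remaining = threshold_bytes - samples[-1][1]
    if remaining <= 0:
        return 0.0   # threshold already reached
    if growth <= 0:
        return None  # flat or shrinking: the threshold is never reached
    return remaining / growth
```

For example, two measurements taken two days apart that show 2000 bytes of growth give a rate of 1000 bytes per day, from which the time left before a chosen threshold follows directly.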

But we wanted to give you a better way to access this information. So we store these usage tracking reports, as JSON, in a dedicated index in OpenSearch itself, and create a BAI monitoring source on top of this index, so that you can build a dashboard to display the information. The shell script does all of this for you. We also built a dashboard, which we are happy to provide as part of the solution. Because the script runs periodically, you can see how the document counts, average sizes and total sizes are trending. As with a regular monitoring source, you can control which users see this information from the Permissions tab of the Business Performance Center (BPC).

Use this dashboard as is or adapt it to your needs. It’s up to you!
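To make the idea of a JSON report concrete, the following Python sketch builds one report per alias and formats a batch of them for the OpenSearch bulk API (one action line followed by one document line, newline-delimited). The field names and the index name are assumptions made for this example; the actual report layout is defined by the script in the repository.

```python
import json

def build_report(environment, alias, doc_count, avg_size, timestamp):
    """One usage report for one alias (hypothetical field names)."""
    return {
        "timestamp": timestamp,
        "environment": environment,
        "alias": alias,
        "doc_count": doc_count,
        "avg_doc_size_bytes": avg_size,
        "total_size_bytes": round(doc_count * avg_size),
    }

def bulk_body(index_name, reports):
    """Newline-delimited JSON body for a bulk request that indexes the reports."""
    lines = []
    for report in reports:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(report))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

Keeping a timestamp and the environment name in every report is what lets a dashboard chart the trend per environment.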

Easter egg: When you run the script, you will see that it also includes a function that displays the number of fields currently used in the index mapping, for the index used for writing the documents (only one index per alias is concerned). This is useful when you want to know how close you are to a mapping explosion. If you don’t know what this refers to, it is a limitation in OpenSearch that may affect you sooner or later: a given index cannot hold more than a set number of fields (1000 by default). Once this limit is reached, documents that would add new fields can no longer be written to the index.
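For readers curious about how such a count can be obtained, here is a rough Python sketch that walks the mappings section returned by the index mapping API. It is an approximation written for this post, not the function shipped in the script: it counts every entry under properties (object fields included) plus any multi-fields declared under fields, which may differ slightly from the way OpenSearch itself counts against the limit.

```python
def count_mapping_fields(mapping):
    """Recursively count fields in a mapping.

    `mapping` is the "mappings" object of an index, or any nested
    object definition that has its own "properties".
    """
    total = 0
    for definition in mapping.get("properties", {}).values():
        total += 1                                # the field itself
        total += len(definition.get("fields", {}))  # multi-fields, e.g. .keyword
        total += count_mapping_fields(definition)   # nested object properties
    return total

def limit_usage(mapping, limit=1000):
    """Fraction of the field limit already used (1000 is the default limit)."""
    return count_mapping_fields(mapping) / limit
```

Comparing the count with the limit over time shows whether new fields keep appearing, which is usually the sign of dynamic keys leaking into the documents.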

Solution details

The details of this solution are available in the ‘Usage tracking’ code sample published in this public GitHub repository: https://github.com/icp4a/bai-usage-tracking-sample

The two main files are the bai-data-metrics.sh bash script, and the BAI Data Metrics Tracking.json dashboard file.

To run the script, you need to define a few environment variables: the OpenSearch endpoint URL, the OpenSearch user name and its password. You may need some oc commands to find this information in your cluster. You can also pass a variable called environment, which is included in every report and identifies the environment the script ran against (dev, test, prod, etc., depending on your use case).

You can import the dashboard into BPC as a dashboard template, or modify the file before import if you don’t want it imported as a BPC template. You can then adapt it to your needs in terms of chart visualization, time range, intervals, etc. You can also configure your own alerts if you want to be notified when the number of documents or their cumulative size exceeds a given threshold.

------------------------------
Nicolas Sauterey, Nithya Velma, Philippe Kaplan
from the BAI Development team
------------------------------
