By default, you won't have information in the InfluxDB database about where a particular server is located, nor whether it is for production or testing purposes. Certainly, it would be very useful to be able to show server utilization, for example, only in one Data Center, or only from one environment. Without using variables in Grafana, this is not so simple, and manually setting server/LPAR names in Grafana panels can be very annoying.
The goal is to achieve a selection panel in the dashboard as below. Selecting the appropriate Data Center, environment type, and machine class automatically limits the visible dashboard charts to only the relevant machines and LPARs.
To achieve this effect, you must go into Settings in your dashboard, select the Variables option, and create new variables. Note that some will be "custom" variables, while others will be "Query" types, which will query the InfluxDB database.
I personally recommend creating a separate dashboard for machine statistics and another dashboard for LPAR statistics. There are so many data points possible to collect that presenting everything in one place might require scrolling through the dashboard and it could lose its simplicity. In the case of a dashboard with machine statistics only, the LPAR variable should, of course, be omitted.
WARNING: everything entered in the "CUSTOM OPTIONS" and "QUERY" fields must be written as a single continuous string, with no line break characters anywhere. Unfortunately, I had to wrap the text in the article, so the examples may look misleading.
"DATACENTER" custom variable
You can divide servers by location using the Custom type variable. Pay attention to the format of the entries. Of course, you need to replace any names with those corresponding to your real machines.
Correct syntax: GROUP1 : item1|item2|item3,GROUP2 : item4|item5|item6
Be careful with space characters and do not use line break characters. Everything must be in a single line.
DC1 : Server1_S1022|Server2_S1022|Server3_S1022|Server10_E1050|Server11_E1050|Server20_E1080|Server21_E1080, DC2 : Server4_S1022|Server5_S1022|Server6_S1022|Server12_E1050|Server13_E1050|Server22_E1080|Server23_E1080, DC3 : Server7_S1022|Server8_S1022|Server9_S1022|Server14_E1050|Server15_E1050|Server24_E1080|Server25_E1080
Example in Grafana:
Remember to select the Multi-value option. Notice the value preview at the bottom of the page.
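Because the value must be one line in an exact format, it can be easier to generate it than to type it. Below is a minimal sketch, assuming you keep the group-to-server mapping in a Python dictionary. The function name build_custom_options and the shortened server lists are illustrative only and are not part of Grafana.

```python
def build_custom_options(groups):
    """Return 'GROUP1 : a|b,GROUP2 : c|d' as one line for the Custom options field."""
    parts = []
    for group, servers in groups.items():
        for name in [group, *servers]:
            # Commas, pipes and line breaks would corrupt the format.
            if any(ch in name for ch in ",|\n\r"):
                raise ValueError(f"unsupported character in {name!r}")
        parts.append(f"{group} : {'|'.join(servers)}")
    return ",".join(parts)


# Shortened example data; replace with your real machines.
datacenters = {
    "DC1": ["Server1_S1022", "Server10_E1050"],
    "DC2": ["Server4_S1022", "Server12_E1050"],
}
print(build_custom_options(datacenters))
# DC1 : Server1_S1022|Server10_E1050,DC2 : Server4_S1022|Server12_E1050
```

The printed string can be pasted directly into the custom variable definition.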
"ENV" custom variable
The rules are the same as for "DATACENTER".
TEST : Server1_S1022|Server2_S1022|Server4_S1022|Server5_S1022|Server7_S1022|Server8_S1022|Server10_E1050|Server12_E1050|Server14_E1050,PROD : Server3_S1022|Server6_S1022|Server9_S1022|Server11_E1050|Server13_E1050|Server15_E1050|Server20_E1080|Server22_E1080|Server24_E1080|Server21_E1080|Server23_E1080|Server25_E1080
Example in Grafana:
"TYPE" custom variable
The rules are the same as for "DATACENTER".
Scale-out : Server1_S1022|Server4_S1022|Server7_S1022|Server2_S1022|Server5_S1022|Server8_S1022|Server3_S1022|Server6_S1022|Server9_S1022,Midrange : Server10_E1050|Server12_E1050|Server14_E1050|Server11_E1050|Server13_E1050|Server15_E1050,Enterprise : Server20_E1080|Server22_E1080|Server24_E1080|Server21_E1080|Server23_E1080|Server25_E1080
Example in Grafana:
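Outside Grafana, a small script can check that the three custom variables list the same servers, since a machine missing from one of them will silently disappear from the filtered results. This is only a sketch: the helper names are hypothetical and the strings are shortened versions of the values above, deliberately made inconsistent to show the output.

```python
def parse_custom_options(text):
    """Parse 'GROUP : a|b,GROUP2 : c' into {group: set of servers}."""
    groups = {}
    for entry in text.split(","):
        label, _, members = entry.partition(" : ")
        groups[label.strip()] = set(members.strip().split("|"))
    return groups


def all_servers(text):
    """Return every server named anywhere in a custom-options string."""
    return set().union(*parse_custom_options(text).values())


def missing_from(reference, other):
    """Servers present in `reference` but absent from `other`."""
    return all_servers(reference) - all_servers(other)


# Shortened, intentionally inconsistent example values.
datacenter = "DC1 : Server1_S1022|Server10_E1050,DC2 : Server4_S1022"
env = "TEST : Server1_S1022|Server4_S1022,PROD : Server20_E1080"

print(sorted(missing_from(datacenter, env)))  # ['Server10_E1050']
print(sorted(missing_from(env, datacenter)))  # ['Server20_E1080']
```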
"SERVERNAME" query
In this case, the variable is of the QUERY type and uses information from the InfluxDB database. Pay attention to the conditions used in the query, such as "DATACENTER:pipe", "TYPE:pipe", and "ENV:pipe". Using such a query allows you to limit the data only to those selected in the filtering panel.
SHOW TAG VALUES WITH KEY = "servername" WHERE servername =~/^*(${DATACENTER:pipe})$*/ AND servername =~/^*(${TYPE:pipe})$*/ AND servername =~/^*(${ENV:pipe})$*/
Example in Grafana:
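To see why the three conditions narrow the list, it helps to mimic them outside Grafana. The sketch below assumes that the pipe format simply joins the selected values with "|", and it uses a plain full-name match instead of reproducing the exact regular expression from the query. The function names and the shortened lists are illustrative.

```python
import re


def pipe_format(selected_values):
    """Mimic ${VAR:pipe}: join all selected values with '|'."""
    return "|".join(selected_values)


def filter_servers(servers, *selections):
    """Keep servers whose full name matches every selection's alternation."""
    # Assumes server names contain no regex metacharacters.
    patterns = [re.compile(f"(?:{pipe_format(sel)})") for sel in selections]
    return [s for s in servers if all(p.fullmatch(s) for p in patterns)]


# Shortened values of the DC1, PROD and Scale-out options.
dc1 = ["Server1_S1022|Server2_S1022|Server3_S1022|Server10_E1050"]
prod = ["Server3_S1022|Server6_S1022|Server11_E1050"]
scale_out = ["Server1_S1022|Server2_S1022|Server3_S1022|Server6_S1022"]

known = ["Server1_S1022", "Server3_S1022", "Server6_S1022", "Server10_E1050"]
print(filter_servers(known, dc1, prod, scale_out))  # ['Server3_S1022']
```

Only the server that appears in all three selections survives, which is the same intersection the AND conditions produce in the query.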
"LPAR" query
This query displays the names of LPARs, but only those that match a given server. The query displays only the latest names to avoid duplicating names (e.g., after migrating an LPAR between machines) and is limited only to LPARs from the last 30 days to avoid displaying outdated information of old LPARs. Of course, you can choose to omit this last condition.
SELECT last("name") FROM "lpar_details" WHERE ("servername" =~ /^$SERVERNAME$/) AND time > now() - 30d GROUP BY "lparname"
Example in Grafana:
All variables
The entire set of variables should look as follows. A "WARNING" sign might appear if some of the variables are not used in the dashboard or by another variable.
Queries in panels
In Grafana panels, you should use the variable name instead of hard-coding the LPAR or machine names. Do it as in the example below:
SELECT "currentVirtualProcessors" FROM "lpar_processor" WHERE ("lparname" =~ /^$LPAR$/) AND $timeFilter GROUP BY "lparname"
SELECT "availableProcUnits" FROM "server_processor" WHERE ("servername" =~ /^$SERVERNAME$/) AND $timeFilter GROUP BY "servername"
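When several LPARs are selected, Grafana replaces the variable with a regular-expression alternation before sending the query to InfluxDB. The sketch below approximates that expansion as I understand it; the exact escaping rules may differ between Grafana versions, and the function names and LPAR names (lpar01, lpar02) are hypothetical.

```python
import re

# Characters escaped before a value is placed inside a regex (an approximation).
_SPECIAL = re.compile(r"[\\^$*+?.()|\[\]{}/]")


def regex_escape(value):
    """Prefix regex metacharacters with a backslash."""
    return _SPECIAL.sub(lambda m: "\\" + m.group(0), value)


def expand(query, name, selected):
    """Replace $name with one escaped value or an (a|b) alternation."""
    escaped = [regex_escape(v) for v in selected]
    if len(escaped) == 1:
        replacement = escaped[0]
    else:
        replacement = "(" + "|".join(escaped) + ")"
    return query.replace(f"${name}", replacement)


query = ('SELECT "currentVirtualProcessors" FROM "lpar_processor" '
         'WHERE ("lparname" =~ /^$LPAR$/) AND $timeFilter GROUP BY "lparname"')
print(expand(query, "LPAR", ["lpar01", "lpar02"]))
```

The Query inspector in the panel editor shows the query that Grafana actually sends, which is the reliable way to confirm the expansion in your version.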
That's all regarding server filtering :) A certain complication is that variables have to be entered for each dashboard separately, so it's worth cloning dashboards instead of creating them from scratch. Remember to update the variables when, for example, a new physical server appears in your environment or a server is moved between locations.
Some changes may not be visible until you save the dashboard. It's also common for the browser to remember old names in filters; in that case, try a hard refresh (CTRL + F5), which reloads the page while bypassing the browser cache.
Setting up the legend in a Grafana panel - Time series graph
In the case of the Time Series Graph, it's beneficial to utilize the ability to sort by legend values. This way, you can easily check, for example, which LPARs utilize the CPU the most or have the highest peaks. Change the legend to a table form and select the values that interest you.
Example of legends sorted by Max value:
Stacked or unstacked?
A very useful feature in Grafana is the ability to create stack groups, which are independent of the panel's general stacking setting. This lets you sum selected series while keeping the others separate.
For example, you may want the graph to sum the utilization of multiple LPARs across multiple machines and, at the same time, show the TOTAL CPU of the physical machines as a separate stack, without combining the two (I hope I didn't confuse you too much :) ).
To achieve this, add a STACK setting under the "OVERRIDE" option for the selected query and give the group an appropriate name, e.g., TOTAL for queries about the total CPU of all machines, and UTIL for queries about utilization data.
Change InfluxQL query editor to Text editor mode (RAW)
The editor is certainly a convenience, but I think when working with Grafana and InfluxDB, you'll find it beneficial to learn and perform more complex tasks using "RAW MODE". To switch between modes, click the pencil icon next to the query.
Use tags
Notice that in the QUERY, in the "Alias by" field, you can use tags, written with a leading "$", for example:
- $tag_servername
- $tag_lparname
- $tag_viosname
You can use them if the corresponding tag names appear in the GROUP BY clause of the query. The tag values will then be displayed in the legend, which is extremely useful.
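As an illustration of what the alias does, the sketch below substitutes tag values into an alias pattern written with a leading "$", such as $tag_servername. It only mimics the idea and is not Grafana's implementation; the function name and the LPAR name are hypothetical.

```python
import re


def render_alias(pattern, tags):
    """Replace each $tag_<name> with the matching tag value, if present."""
    def substitute(match):
        # Leave the placeholder untouched when the tag is not in GROUP BY.
        return tags.get(match.group(1), match.group(0))
    return re.sub(r"\$tag_(\w+)", substitute, pattern)


# Tags returned for one series grouped by servername and lparname.
tags = {"servername": "Server1_S1022", "lparname": "lpar01"}
print(render_alias("$tag_servername / $tag_lparname", tags))  # Server1_S1022 / lpar01
print(render_alias("$tag_viosname", tags))  # $tag_viosname
```

The second call shows why the GROUP BY matters: a tag that is not part of the grouping has no value to substitute.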
Summary
I hope you found this text useful. If the topic is of interest to you and you would like more advice, or if you are interested in examples of dashboards for monitoring the Power platform, feel free to contact me :)