 nimon: Two Years of Data Not Appearing in Dashboards

William Woods posted Wed July 02, 2025 11:04 AM

Dear Community:

We have been running nimon for over two years, collecting statistics from about 400 AIX- and Linux-based systems and retaining 105 weeks of data. The InfluxDB database & Grafana are installed on a RHEL 7.9 (3.10.0-1160.135.1.el7.x86_64 #1 SMP Tue May 13 01:53:34 EDT 2025 x86_64 x86_64 x86_64 GNU/Linux kernel) host which was updated on June 24.

Today, when I opened Grafana, none of the dashboards were populated with data. I found that the InfluxDB daemon was not running. I started it and checked the dashboards, but still no data was displayed, so I superstitiously restarted the Grafana service. After several minutes, new data was being plotted on the dashboards, but the previous 105 weeks of statistics were missing. Our polling interval is 5 minutes, which explains the delay before new statistics were graphed.

Versions:

nimon: a mix of 7.1 & 8.3

InfluxDB: 1.8.3

Grafana: 11.5.1

We keep track of how much NAS storage the InfluxDB database consumes, and it doesn't seem like the data was actually lost:

date        Filesystem         1048576-blocks  Used    Available  Capacity  Mounted on
2025-06-09  nasnfsco:/njnimon  608996          529229  79767      87%       /nimon
2025-06-16  nasnfsco:/njnimon  608996          529863  79134      88%       /nimon
2025-06-23  nasnfsco:/njnimon  608996          530635  78361      88%       /nimon
2025-06-30  nasnfsco:/njnimon  608996          531816  77180      88%       /nimon
2025-07-02  nasnfsco:/njnimon  608996          527002  81995      87%       /nimon

Retention policy:

> show retention policies on njmon
name    duration   shardGroupDuration replicaN default
----    --------   ------------------ -------- -------
autogen 17640h0m0s 168h0m0s           1        true
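
As a quick sanity check on the policy above, the 17640h duration can be converted to weeks with plain shell arithmetic. This is only an illustrative sketch; the helper name hours_to_weeks is made up for this post.

```shell
#!/bin/sh
# Convert a duration given in whole hours to whole weeks.
# hours_to_weeks is a hypothetical helper, not part of InfluxDB or nimon.
hours_to_weeks() {
    echo $(( $1 / 168 ))   # 168 hours = 7 days
}

hours_to_weeks 17640   # retention duration from the policy above
hours_to_weeks 168     # shard group duration from the policy above
```

The first call prints 105 and the second prints 1, so the policy matches the intended 105 weeks of retention, with one shard group per week.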

After initial setup, nimon has "just worked" all these years so I'm not sure how to begin troubleshooting. Any suggestions are welcome.

Thanks & Best Regards,

William

Nigel Griffiths IBM Champion

Hi,

First, I would note that you appear to be running an out-of-date release of InfluxDB 1.8, from something like 2022.

On the downloads page, https://www.influxdata.com/downloads/, the current download for RHEL on AMD64 servers is version 1.11, installed with these commands:

wget https://download.influxdata.com/influxdb/releases/influxdb-1.11.8.x86_64.rpm
sudo yum localinstall influxdb-1.11.8.x86_64.rpm

I would upgrade in case you have hit an already fixed bug.

You have already verified that there is sufficient disk capacity - full marks.

I would also check for errors reported by the OS (RHEL) and NAS, in case your NAS has failed.

Then I would check the InfluxDB and Grafana log files in case they are reporting errors.
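
A rough sketch of how the log check could be narrowed to warnings and errors. The filter_problems helper is hypothetical, and the severity patterns, the systemd unit name and the Grafana log path are assumptions about a default RPM install, so adjust them to your host.

```shell
#!/bin/sh
# Print only warning/error lines from a logfmt-style log read on stdin.
# Assumption: InfluxDB 1.x marks severity as lvl=..., Grafana as level=...
filter_problems() {
    grep -E '(lvl|level)=(warn|error)' || true   # no match is not a failure
}

# Example usage on the InfluxDB/Grafana host (not run here; the unit name
# and log path below are assumptions):
#   journalctl -u influxdb --since "2025-06-24" --no-pager | filter_problems
#   filter_problems < /var/log/grafana/grafana.log
```

Starting the search at the date of the OS update keeps the output short enough to read by eye.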

Next, I would create a very simple NEW Grafana dashboard and check whether you can see the missing data period or not.

This might rule out a problem with the existing dashboards or their data access.

Finally, you could look at the underlying data inside InfluxDB. This is not easy, and the size of your database makes it harder: a simple SQL-like command may output millions of rows of data.
Very briefly, a starter for ten:

Start the test access with the influx command (this is from memory; you will need to find the V1 documentation for the CLI). Assume you have a virtual machine called "fred".

influx

> show databases

> use njmon

> show measurements

> show tag keys

> show field keys

> select last(*) from timestamp where ("host" = 'fred')

> select count(*) from timestamp

Note: timestamp is a measurement name (a small measurement). Might take a long time as it counts for all time and all servers!

> select count(*) from timestamp where time > '2025-06-01'

Guess here - I am not online to check:

> select count(*) from timestamp where ( time > '2025-06-01' and "host" = 'fred')
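
If you would rather run these checks from a script than from the interactive prompt, something like the sketch below might help. The count_query helper is hypothetical; the point it illustrates is that InfluxQL takes single quotes around string values such as the host name, while double quotes are for identifiers such as the tag key.

```shell
#!/bin/sh
# Build an InfluxQL count query for one host since a given date.
# count_query is a hypothetical helper. Tag values are string literals
# (single quotes); the tag key is an identifier (double quotes).
count_query() {
    host=$1
    since=$2
    printf "select count(*) from timestamp where time > '%s' and \"host\" = '%s'\n" "$since" "$host"
}

count_query fred 2025-06-01

# On the InfluxDB host (not run here), the v1 CLI can execute it directly:
#   influx -database njmon -execute "$(count_query fred 2025-06-01)"
```

Restricting the query to one host and a recent start date keeps the count quick even on a large database.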

You have backups of the InfluxDB, right?

Best of luck, N

William Woods

Thank you for your help, Nigel! Updating the influxdb package to v1.11.8 fixed the problem.

Apologies for the delayed response, but I needed to schedule a planned change for today (2025-07-09) to update the package, and I have just completed it.

Best Regards,
William