Hi Yeswanth,
Not sure why you placed the question on this forum as njmon is open source but no harm done.
The njmon repository is here https://nmon.sourceforge.io/pmwiki.php?n=Site.Njmon
And I can be contacted here nigel ar griffiths at hotmail.com - you will have to correct the spaces and @ sign :-)
It took a while but I found the problem.
The njmon C code has a sanity check of 256 Logical CPUs numbered 0 to 255.
- Clearly, computers are getting big over time in the number of logical CPUs and physical CPUs.
- I have never had access to a Linux computer this large before.
- Although the Power10 servers max out at 240 physical CPUs, with a maximum of 1920 logical CPUs (SMT=8).
- There is also a logistics problem: graphs with more than 200 lines become a complete mess, as there are too many lines drawn one on top of the other and you can't see through the bird's nest.
- You will notice that njmonchart has problems showing the key for the CPUs; in your case it comes out as 15 lists of CPUs.
- Then there is the problem that the human eye can't accurately distinguish 200 different colours, so if the "blue" CPU is the one you are after, there are, say, 10 "blue" CPUs in the key list.
- I do not have a simple solution for this problem.
When our grandchildren are running njmon :-) with 100,000 CPUs they will not want a line chart but will perhaps use a heat-map to see how many hot CPUs are active.
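As a rough illustration of the data reduction behind such a heat-map, the sketch below counts how many CPUs fall into each 10% band of busy time, so 100,000 lines collapse into ten numbers. The function name, the band width and the input array are assumptions for illustration only; nothing like this exists in njmon or njmonchart today.

```c
#define BUCKETS 10

/* Count how many CPUs fall in each 10% band of busy time.
 * busy[] holds one percentage (0..100) per logical CPU.
 * Hypothetical helper, not part of njmon. */
void cpu_histogram(const double *busy, int ncpus, int counts[BUCKETS])
{
    for (int b = 0; b < BUCKETS; b++)
        counts[b] = 0;
    for (int i = 0; i < ncpus; i++) {
        int b = (int)(busy[i] / 10.0);
        if (b < 0)
            b = 0;                 /* guard against bad input */
        if (b >= BUCKETS)
            b = BUCKETS - 1;       /* 100% joins the top band */
        counts[b]++;
    }
}
```

A chart could then colour ten cells by these counts instead of drawing one line per CPU.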
In addition, there is a programming problem in allocating memory structure space for high numbers of CPUs (the same goes for disks etc.).
That memory is wasted if the server only has a dozen CPUs.
The GNU guys would call the 256 maximum a static magic number problem. Their coding standard forbids such magic numbers.
In the short term we can change the maximum to say 2000 and recompile njmon but that just postpones the problem.
In the medium term, njmon can count the number of CPUs and then allocate the correct memory size.
But there is a further problem as the number of CPUs can dynamically change live!
Or let the user override the default 256 with a command line option or shell variable - but this requires the user to read the manual!!
In the longer term, njmon will have to check every snapshot if the number of CPUs has gone up or down and adjust the memory sizes.
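A minimal sketch of how the medium and longer term ideas might fit together, assuming a hypothetical per-CPU record and a hypothetical NJMON_MAX_CPUS shell variable (neither is in njmon today). It asks the OS for the configured CPU count with sysconf(), lets the user override it, and grows the array when a snapshot needs more entries. It only ever grows; shrinking is left out to keep it short.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical per-CPU record; njmon's real structure is different. */
struct cpu_stat {
    long long user, nice, sys, idle, iowait;
};

static struct cpu_stat *cpu_stats = NULL; /* one entry per logical CPU */
static long cpu_capacity = 0;             /* entries currently allocated */

/* Number of logical CPUs to allow for: the (hypothetical) NJMON_MAX_CPUS
 * shell variable if it is set to a positive number, otherwise the number
 * of configured CPUs reported by the OS. */
long wanted_cpus(void)
{
    const char *override = getenv("NJMON_MAX_CPUS");
    if (override != NULL) {
        char *end;
        long n = strtol(override, &end, 10);
        if (end != override && *end == '\0' && n > 0)
            return n;
    }
    long n = sysconf(_SC_NPROCESSORS_CONF);
    return n > 0 ? n : 1;
}

/* Call once per snapshot: grow the array if more CPUs are needed.
 * Returns 0 on success, -1 if memory could not be allocated. */
int ensure_cpu_capacity(long needed)
{
    if (needed <= cpu_capacity)
        return 0;                          /* already big enough */
    struct cpu_stat *p = realloc(cpu_stats, (size_t)needed * sizeof(*p));
    if (p == NULL)
        return -1;                         /* old array is still valid */
    /* Zero only the newly added entries. */
    memset(p + cpu_capacity, 0,
           (size_t)(needed - cpu_capacity) * sizeof(*p));
    cpu_stats = p;
    cpu_capacity = needed;
    return 0;
}
```

At start-up this would be called as ensure_cpu_capacity(wanted_cpus()), and then again each snapshot with the CPU count actually seen, so a server with a dozen CPUs only pays for a dozen entries.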
I will release a new njmon for Linux and include another fix for a big bug found yesterday - ASAP as version njmon_linux_v85.c at https://nmon.sourceforge.io/pmwiki.php?n=Site.Njmon
If you want to recompile your version, change njmon_linux_v83.c line 2170 from
#define MAX_LOGICAL_CPU 256
to
#define MAX_LOGICAL_CPU 2048
The utilisation structure grows from 20K to 164K = small beer these days.
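Those two sizes are consistent with about 80 bytes per logical CPU: 20480 / 256 = 80, and 80 x 2048 = 163840, which is roughly 164K. The sketch below reproduces that arithmetic with a hypothetical 80-byte record (ten 8-byte counters); the real njmon structure may be laid out differently.

```c
#include <stddef.h>

/* Hypothetical per-CPU record: ten 8-byte counters = 80 bytes on
 * typical Linux platforms. njmon's actual structure may differ. */
struct cpu_util {
    long long counter[10];
};

/* Bytes needed for an array covering max_cpus logical CPUs. */
size_t util_bytes(size_t max_cpus)
{
    return max_cpus * sizeof(struct cpu_util);
}
```

With these assumptions util_bytes(256) is 20480 and util_bytes(2048) is 163840.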
I would very much like an njmon capture sample from your Linux server for testing purposes once you have the improved njmon version.
And to code the medium term solution.
Thanks for the Post - a good community contribution helping other njmon users.
------------------------------
Nigel Griffiths - IBM retired
London, UK
@mr_nmon
------------------------------
Original Message:
Sent: Thu January 09, 2025 12:38 PM
From: Yeswanth Jojode
Subject: NJMON CPU data collection
Hi Nigel,
We are utilizing the NJMON tool for collecting system metrics. However, we have recently encountered a peculiar issue with its functionality. Specifically, the tool seems to have a limitation or is exhibiting unexpected behavior when gathering CPU-related data.
The problem arises when the system's CPU count exceeds 256 cores. Beyond this threshold, NJMON is not collecting or reporting CPU metrics.


We would appreciate your insights or suggestions on how to address this issue. Is this a known limitation of NJMON, or might there be a configuration setting or workaround to resolve this?
Many thanks in advance for your support.
Kind regards
Yeswanth
------------------------------
Yeswanth Jojode
------------------------------