Originally posted by: MatthewBourne
Folks
I've been struggling to help my customer with this issue for some months now; unfortunately, nothing tremendously useful has come out of the PMR route just yet. Wondering if anyone has seen similar symptoms?
Please help!
Thanks
Scenario: uptime is less than 60 days, memory is all but exhausted, including paging space. Left alone, the LPAR will probably crash, logging "out of resource" type messages in the error log.
> oslevel -s
6100-03-03-0943
> lsattr -El mem0
ent_mem_cap            I/O memory entitlement in Kbytes           False
goodsize       4096    Amount of usable physical memory in Mbytes False
size           4096    Total amount of physical memory in Mbytes  False
var_mem_weight         Variable memory capacity weight            False
> lsps -a
Page Space  Physical Volume  Volume Group  Size    %Used  Active  Auto  Type  Chksum
hd6         hdisk0           rootvg        4160MB  56     yes     yes   lv    0
NMON tells me that the kernel is the biggest consumer of memory pages:
FileSystemCache(numperm)  11.1%
Process                   16.3%
System                    71.9%
Free                       0.8%
SVMON says pgsp is over 50% utilised...
> svmon -G -O unit=MB
Unit: MB
-------------------------------------------------------------------------------
                  size       inuse        free         pin     virtual   available
memory         4096.00     4060.28        35.7      985.27     5497.27      367.96
pg space       4160.00     2314.65

                  work        pers        clnt       other
pin             846.77           0           0      138.50
in use         3441.47           0      618.81
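For anyone wanting to check the arithmetic, the utilisation figures follow directly from the svmon -G numbers. The Python sketch below is only an illustration, not part of any AIX tooling; the variable names are made up for the example, and treating "virtual" minus "size" as a rough measure of over-commitment is an assumption rather than an official definition.

```python
# Figures copied from the svmon -G output in this post (all in MB).
mem_size, mem_inuse = 4096.00, 4060.28
pgsp_size, pgsp_inuse = 4160.00, 2314.65
virtual = 5497.27


def pct(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


pgsp_used_pct = pct(pgsp_inuse, pgsp_size)  # paging space utilisation
mem_used_pct = pct(mem_inuse, mem_size)     # real memory utilisation

# Assumption: working-storage virtual memory beyond real memory is a
# rough indicator of how much has to live in paging space.
overcommit_mb = round(virtual - mem_size, 2)

print(f"paging space used: {pgsp_used_pct}%")
print(f"real memory used:  {mem_used_pct}%")
print(f"virtual beyond real memory: {overcommit_mb} MB")
```

The paging-space percentage agrees with the 56% that lsps -a reports after rounding.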
... but I've got nearly 2GB occupancy in pgsp (7x256MB) due to kernel segments that show a minimal, if not zero, count of pages in use (~110MB). Over time, we observe the number of segments that look like this increasing, until eventually the LPAR becomes unresponsive and ultimately crashes.
> svmon -S -t 10 -O unit=MB,filtercat=kernel,sortseg=pgsp
Unit: MB

    Vsid      Esid Type Description        PSize   Inuse   Pin    Pgsp  Virtual
   46008         - work kernel heap            m       0     0  256.00   256.00
   5600a         - work kernel heap            m       0     0  256.00   256.00
   3e007         - work kernel heap            m       0     0  256.00   256.00
   4e009         - work kernel heap            m       0     0  256.00   256.00
   36006         - work kernel heap            m       0     0  256.00   256.00
   2e005         - work kernel heap            m    15.0     0  241.00   256.00
   5e00b         - work kernel heap            m  100.88  0.06  155.12   256.00
    6a00         - work kernel heap            m  106.94  8.88  106.25   115.25
    4000         - work page table area        s    9.45  0.17    23.7     23.9
   28005  9ffffffd work shared library        sm    0.28     0    7.66     7.66
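To track how many kernel heap segments end up entirely in paging space, the svmon -S rows can be summarised with a few lines of Python. This is a minimal sketch under stated assumptions: the rows are pasted in by hand from the output in this post, the description is assumed to be the only multi-word column (so each row is parsed from both ends), and the function and variable names are hypothetical.

```python
# Rows copied from the svmon -S output in this post (values in MB).
SVMON_S_ROWS = """\
46008 - work kernel heap m 0 0 256.00 256.00
5600a - work kernel heap m 0 0 256.00 256.00
3e007 - work kernel heap m 0 0 256.00 256.00
4e009 - work kernel heap m 0 0 256.00 256.00
36006 - work kernel heap m 0 0 256.00 256.00
2e005 - work kernel heap m 15.0 0 241.00 256.00
5e00b - work kernel heap m 100.88 0.06 155.12 256.00
6a00 - work kernel heap m 106.94 8.88 106.25 115.25
4000 - work page table area s 9.45 0.17 23.7 23.9
28005 9ffffffd work shared library sm 0.28 0 7.66 7.66
"""


def parse_segment(line):
    """Split one row; the description may contain spaces, so the
    numeric columns are taken from the right-hand end."""
    fields = line.split()
    inuse, pin, pgsp, virtual = (float(v) for v in fields[-4:])
    return {
        "vsid": fields[0],
        "esid": fields[1],
        "type": fields[2],
        "description": " ".join(fields[3:-5]),
        "psize": fields[-5],
        "inuse": inuse,
        "pin": pin,
        "pgsp": pgsp,
        "virtual": virtual,
    }


segments = [parse_segment(l) for l in SVMON_S_ROWS.splitlines() if l.strip()]
heap = [s for s in segments if s["description"] == "kernel heap"]
paged_out = [s for s in heap if s["inuse"] == 0]  # nothing resident in memory

heap_pgsp_mb = round(sum(s["pgsp"] for s in heap), 2)
paged_out_pgsp_mb = round(sum(s["pgsp"] for s in paged_out), 2)

print(f"kernel heap segments: {len(heap)}, pgsp {heap_pgsp_mb} MB")
print(f"fully paged out:      {len(paged_out)}, pgsp {paged_out_pgsp_mb} MB")
```

Running the same summary against snapshots taken days apart would show whether the count of fully paged-out kernel heap segments is growing.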
VMO settings are as per recommendation, I believe:
> vmo -F -L
NAME CUR DEF BOOT MIN MAX UNIT TYPE DEPENDENCIES
ams_loan_policy n/a 1 1 0 2 numeric D
force_relalias_lite 0 0 0 0 1 boolean D
kernel_heap_psize 64K 0 0 0 16M bytes B
lgpg_regions 0 0 0 0 8E-1 D lgpg_size
lgpg_size 0 0 0 0 16M bytes D lgpg_regions
low_ps_handling 1 1 1 1 2 D
maxfree 1088 1088 1088 16 838860 4KB pages D minfree memory_frames
maxperm 899409 899409 S
maxpin 845956 845956 S
maxpin% 80 80 80 1 100 % memory D pinnable_frames memory_frames
memory_frames 1M 1M 4KB pages S
memplace_data 2 2 2 1 2 D memory_affinity
memplace_mapped_file 2 2 2 1 2 D memory_affinity
memplace_shm_anonymous 2 2 2 1 2 D memory_affinity
memplace_shm_named 2 2 2 1 2 D memory_affinity
memplace_stack 2 2 2 1 2 D memory_affinity
memplace_text 2 2 2 1 2 D memory_affinity
memplace_unmapped_file 2 2 2 1 2 D memory_affinity
minfree 960 960 960 8 838860 4KB pages D maxfree memory_frames
minperm 29980 29980 S
minperm% 3 3 3 1 100 % memory D maxperm% maxclient%
nokilluid 0 0 0 0 4G-1 uid D
npskill 8320 8320 8320 1 1M-1 4KB pages D
npswarn 33280 33280 33280 1 1M-1 4KB pages D
numpsblks 1040K 1040K 4KB blocks S
pinnable_frames 796362 796362 4KB pages S
relalias_percentage 0 0 0 0 32K-1 D
scrub 0 0 0 0 1 boolean D
v_pinshm 0 0 0 0 1 boolean D
vmm_default_pspa 0 0 0 -1 100 numeric D
wlm_memlimit_nonpg 1 1 1 0 1 boolean D
##Restricted tunables
--------------------------------------------------------------------------------
cpu_scale_memp 8 8 8 4 64 B
data_stagger_interval 161 161 161 0 4K-1 4KB pages D lgpg_regions
defps 1 1 1 0 1 boolean D
framesets 2 2 2 1 10 B
htabscale n/a -1 -1 -4 0 B
kernel_psize 64K 0 0 0 16M bytes B
large_page_heap_size 0 0 0 0 8E-1 bytes B lgpg_regions
lru_file_repage 0 0 0 0 1 boolean D
lru_poll_interval 10 10 10 0 60000 milliseconds D
lrubucket 128K 128K 128K 64K 1M 4KB pages D
maxclient% 90 90 90 1 100 % memory D maxperm% minperm%
maxperm% 90 90 90 1 100 % memory D minperm% maxclient%
mbuf_heap_psize 64K 0 0 0 16M bytes B
memory_affinity 1 1 1 0 1 boolean B
npsrpgmax 65K 65K 65K 1 1M-1 4KB pages D npsrpgmin
npsrpgmin 49920 49920 49920 1 1M-1 4KB pages D npsrpgmax
npsscrubmax 65K 65K 65K 1 1M-1 4KB pages D npsscrubmin
npsscrubmin 49920 49920 49920 1 1M-1 4KB pages D npsscrubmax
num_spec_dataseg 0 0 0 0 8E-1 B
page_steal_method 1 1 1 0 1 boolean B
psm_timeout_interval 20000 20000 20000 0 60000 milliseconds D
rpgclean 0 0 0 0 1 boolean D
rpgcontrol 2 2 2 0 3 D
scrubclean 0 0 0 0 1 boolean D
soft_min_lgpgs_vmpool 0 0 0 0 90 % D lgpg_regions
spec_dataseg_int 512 512 512 0 8E-1 B
strict_maxclient 1 1 1 0 1 boolean D strict_maxperm
strict_maxperm 0 0 0 0 1 boolean D strict_maxclient
vm_modlist_threshold -1 -1 -1 -2 2G-1 D
vmm_fork_policy 1 1 1 0 1 boolean D
vmm_mpsize_support 2 2 2 0 2 numeric B
vmm_vmap_policy 0 0 0 0 1 boolean D
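As a rough guide to how much paging space is left before the low-paging-space thresholds come into play, the npswarn and npskill values can be converted from 4KB pages to MB. The sketch below is illustrative only: it assumes the documented meaning of these tunables (the number of free paging-space pages at which AIX starts sending SIGDANGER, and at which it starts killing processes, respectively), takes the paging-space figures from the svmon -G output in this post, and uses made-up variable names.

```python
PAGE_KB = 4  # vmo lists these tunables in units of 4KB pages/blocks


def pages_to_mb(pages):
    """Convert a count of 4KB pages to megabytes."""
    return pages * PAGE_KB / 1024.0


npswarn_mb = pages_to_mb(33280)          # 130.0 MB free: SIGDANGER threshold
npskill_mb = pages_to_mb(8320)           # 32.5 MB free: process-kill threshold
numpsblks_mb = pages_to_mb(1040 * 1024)  # 4160.0 MB, matches the lsps -a size

# Paging space figures from svmon -G (MB).
pgsp_size_mb, pgsp_inuse_mb = 4160.00, 2314.65
pgsp_free_mb = pgsp_size_mb - pgsp_inuse_mb

# How much more paging space can be consumed before each threshold.
headroom_to_warn_mb = round(pgsp_free_mb - npswarn_mb, 2)
headroom_to_kill_mb = round(pgsp_free_mb - npskill_mb, 2)

print(f"npswarn = {npswarn_mb} MB free, npskill = {npskill_mb} MB free")
print(f"headroom before npswarn: {headroom_to_warn_mb} MB")
print(f"headroom before npskill: {headroom_to_kill_mb} MB")
```

At roughly 256MB per additional paged-out kernel heap segment, that headroom corresponds to only a handful more segments.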
I also have the most up-to-date fixes I can find for this TL/SP combination:
> emgr -l
ID  STATE LABEL      INSTALL TIME      UPDATED BY ABSTRACT
=== ===== ========== ================= ========== ======================================
1    S    IZ60190    12/02/10 14:40:41            Ifix for an apar IZ60190
2    S    z8703761F3 12/02/10 14:44:35            Ifix for apar IZ87037 at 61F SP03.